Abstract

We study a two-player, zero-sum, dynamic game with incomplete information where 
one of the players is more informed than his opponent. We analyze the limit value as the 
players play more and more frequently. The more informed player observes the realization 
of a Markov process (X, Y) on which the payoffs depend, while the less informed player
only observes Y and his opponent’s actions. We show the existence of a limit value as 
the time span between two consecutive stages goes to zero. This value is characterized 
through an auxiliary optimization problem and as the unique viscosity solution of a second 
order Hamilton-Jacobi equation with convexity constraints. 
1 Introduction.

This paper contributes to the literature on zero-sum dynamic games with incomplete information
by considering the case where one player is always more informed than his opponent.

A key feature appearing in recent contributions to the field of zero-sum dynamic games is 
the interplay between discrete-time and continuous-time dynamic models, as in
Cardaliaguet-Laraki-Sorin [10], Neyman [27] or Cardaliaguet-Rainer-Rosenberg-Vieille [9], where the
authors consider sequences of discrete-time dynamic games in which the players play more and
more frequently. Such an analysis is related to the study of a sequence of discretizations in 
time of a given continuous-time dynamic game. In the present work, we adopt this method 
in order to study a continuous-time zero-sum dynamic game where one player is always more 
informed than his opponent and where the state variable evolves according to an exogenous 
Markov process. Precisely, we consider a model with two payoff-relevant variables $(X_t, Y_t)_{t \geq 0}$
which evolve over time: $X$ is a Markov chain with finite state space and $Y$ is a diffusion
process whose drift parameter depends on the current value of $X$. The process $X$ is privately
observed by the more informed player (say player 1) while $Y$ is publicly observed, allowing the
less informed player (player 2) to learn information about the variable $X$ during the game. We
analyze the sequence of discrete-time games indexed by $n \geq 1$ with incomplete information
and perfect observation of actions, where stages occur at times $q/n$ for $q \geq 0$. At each stage,
player 1 observes a pair of signals $(X_{q/n}, Y_{q/n})$ while player 2 only observes $Y_{q/n}$. The stage payoff
function is assumed to depend on the actions of both players and on $(X_{q/n}, Y_{q/n})$. The global payoff
is a discounted sum of the stage payoffs with discount factor $\lambda_n = \int_0^{1/n} r e^{-rt}\,dt = 1 - e^{-r/n}$,
where $r > 0$ is a given continuous-time discount rate. We assume that the stage payoffs are
not observed and we study the limit value of these games as the players play more and more
frequently.
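
The choice of discount factor can be checked numerically. The following is a minimal sketch (the function names are ours, not the paper's) verifying that the weight $\lambda_n (1-\lambda_n)^q$ given to stage $q$ coincides with the mass that the density $r e^{-rt}$ puts on the interval $[q/n, (q+1)/n)$, so that the discrete weights sum to one.

```python
import math

def discount_factor(r: float, n: int) -> float:
    """lambda_n = integral of r*exp(-r*t) over [0, 1/n] = 1 - exp(-r/n)."""
    return 1.0 - math.exp(-r / n)

def stage_weight(r: float, n: int, q: int) -> float:
    """Weight lambda_n * (1 - lambda_n)**q of stage q in the discounted sum."""
    lam = discount_factor(r, n)
    return lam * (1.0 - lam) ** q

def interval_mass(r: float, n: int, q: int) -> float:
    """Mass that the density r*exp(-r*t) puts on [q/n, (q+1)/n)."""
    return math.exp(-r * q / n) - math.exp(-r * (q + 1) / n)
```

This identity is what allows the discounted sum of stage payoffs to be rewritten as an integral of a piecewise-constant process against $r e^{-rt}\,dt$.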

We provide two characterizations for the limit value of these games as n goes to infinity. 
The first one is a probabilistic representation formula where the optimization variable is the 
set of admissible belief processes for the less informed player. Such a formula already appears 
in Sorin [28] as an illustration of the classical Cav(u) theorem of Aumann and Maschler
[1]. A similar discrete-time formula was introduced by De Meyer in [12] in order to obtain
a continuous-time limit value in a class of financial games and this approach led to several
extensions in continuous-time models (see Cardaliaguet-Rainer [7, 8], Grün [19, 20], Gensbittel
[16, 17], and more recently Cardaliaguet-Rainer-Rosenberg-Vieille [9] and Gensbittel-Grün
[18]). This representation formula is important as it provides a characterization of optimal
processes of revelation (martingales of posteriors induced by optimal strategies). 

The second one is a variational characterization: the limit value is shown to be the unique
viscosity solution of a second-order Hamilton-Jacobi equation with convexity constraints as
introduced by Cardaliaguet [4, 5] and generalized in Cardaliaguet-Rainer [6], Grün [19],
Cardaliaguet-Rainer-Rosenberg-Vieille [9] and Gensbittel-Grün [18].

2 Main results. 

Notation 2.1. For any topological space $E$, $\Delta(E)$ denotes the set of Borel probability
distributions on $E$ endowed with the weak topology and the associated Borel $\sigma$-algebra. $\delta_x$ denotes
the Dirac measure at $x \in E$. Finite sets are endowed with the discrete topology and Cartesian
products with the product topology. $\mathbb{D}([0,\infty), E)$ denotes the set of cadlag trajectories taking
values in $E$, endowed with the topology of convergence in Lebesgue measure. The notations
$\langle \cdot, \cdot \rangle$ and $|\cdot|$ stand for the canonical scalar product and the associated norm in $\mathbb{R}^n$.

Let us at first describe the continuous-time game we will approximate. This description 
is incomplete as we do not define strategies in continuous-time. Rather, we define below 
strategies in the different time-discretizations of this game. The notion of value for this game 
will therefore be the limit value along a sequence of discretizations when the mesh of the 
corresponding partitions goes to zero. 

We assume that $(X_t)_{t \in [0,\infty)}$ is a continuous-time homogeneous Markov chain with finite state
space $K$, infinitesimal generator $R = (R_{k,k'})_{k,k' \in K}$ and initial law $p \in \Delta(K)$. We identify
$\Delta(K)$ with the canonical simplex of $\mathbb{R}^K$, i.e.:

$$\Delta(K) = \Big\{ p \in \mathbb{R}^K \;\Big|\; \forall k \in K,\ p(k) \geq 0,\ \sum_{k \in K} p(k) = 1 \Big\}.$$

Then, we define the real-valued process $(Y_t)_{t \in [0,\infty)}$ as the unique solution of the following
stochastic differential equation (SDE)

$$\forall t \geq 0, \quad Y_t = y + \int_0^t b(X_s, Y_s)\,ds + \int_0^t \sigma(Y_s)\,dW_s, \tag{2.1}$$

where $(W_t)_{t \in [0,\infty)}$ is a standard Brownian motion independent of $X$ and $y \in \mathbb{R}$ is a given
initial condition. The process $Y$ may be seen as some noisy observation of the process $X$.

We assume that the functions $b$ and $\sigma$ in (2.1) are bounded and Lipschitz, and that there
exists $\varepsilon > 0$ such that for all $y \in \mathbb{R}$, $\sigma(y) \geq \varepsilon$. The state process $Z := (X, Y)$ with values
in $K \times \mathbb{R}$ is a well defined Feller Markov process, with semi-group of transition probabilities
denoted $(P_t)_{t \geq 0}$.
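
To make the dynamics of the state process concrete, here is a minimal simulation sketch, assuming an Euler-type discretization on a grid of mesh `dt` and a first-order approximation of the jump probabilities of the chain. The function name, the grid scheme and the example coefficients in the usage note are illustrative assumptions, not part of the model above.

```python
import math
import random

def simulate_state(R, b, sigma, k0, y0, horizon, dt, rng):
    """Euler-type simulation of Z = (X, Y) on a grid of mesh dt.

    R     : generator matrix of X (list of rows, each row sums to 0)
    b     : drift function b(k, y)
    sigma : volatility function sigma(y), assumed bounded away from 0
    Returns the lists of sampled values of X and Y on the grid.
    """
    steps = int(round(horizon / dt))
    xs, ys = [k0], [y0]
    k, y = k0, y0
    for _ in range(steps):
        # Diffusion step for Y, using the current state of the chain.
        y = y + b(k, y) * dt + sigma(y) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # First-order approximation of the chain: jump k -> j w.p. R[k][j] * dt.
        u, acc = rng.random(), 0.0
        for j, rate in enumerate(R[k]):
            if j == k:
                continue
            acc += rate * dt
            if u < acc:
                k = j
                break
        xs.append(k)
        ys.append(y)
    return xs, ys
```

For instance, with two states, `R = [[-1.0, 1.0], [2.0, -2.0]]`, a drift equal to `1.0` in state 0 and `-1.0` in state 1, and `sigma` constant equal to `1.0`, the path of `Y` drifts up or down according to the hidden state, which is what makes it a noisy observation of `X`.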

Let $I$, $J$ denote finite action sets for the two players (players 1 and 2), $g : (K \times \mathbb{R}) \times I \times J \to \mathbb{R}$
a bounded payoff function which is Lipschitz with respect to the second variable, and $r > 0$
a fixed discount rate.

We consider the following (heuristic) zero-sum game, played on the time interval [0, oo): 

• Player 1 observes the trajectory of Z = (X,Y). 

• Player 2 observes only the trajectory of Y. 

• They play the game $G(p,y)$ with total expected payoff for player 1:

$$\mathbb{E}\Big[\int_0^{+\infty} r e^{-rt} g(X_t, Y_t, i_t, j_t)\,dt\Big],$$

where $i_t$ (resp. $j_t$) denotes the action of player 1 (resp. of player 2) at time $t$.

• Actions are observed during the game (and potentially convey relevant information). 

We aim at studying the value function of this game and how information is used by the
more informed player when playing optimally. In order to achieve this goal, we introduce a
sequence of time-discretizations of the game. For simplicity, and without loss of generality, let
us consider the uniform partition of $[0,+\infty)$ of mesh $1/n$. The corresponding discrete-time
game, denoted $G_n(p,y)$, proceeds as follows:

• The variable $Z_{q/n} = (X_{q/n}, Y_{q/n})$ is observed by player 1 before stage $q$ for $q \geq 0$.

• The variable $Y_{q/n}$ is observed by player 2 before stage $q$ for $q \geq 0$.

• At each stage, both players choose simultaneously a pair of actions $(i_q, j_q) \in I \times J$.

• Chosen actions are observed after each stage.

• Stage payoff of player 1 equals $g(Z_{q/n}, i_q, j_q)$ (realized stage payoffs are not observed).

• The total expected payoff of player 1 is

$$\mathbb{E}\Big[\lambda_n \sum_{q \geq 0} (1 - \lambda_n)^q\, g(Z_{q/n}, i_q, j_q)\Big], \quad \text{with } \lambda_n = 1 - e^{-r/n}.$$


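Remark 2.2. When $\sigma$ is constant and $b$ depends only on $X$, the new information observed by
player 2 at stage $q \geq 1$, namely the increment $Y_{q/n} - Y_{(q-1)/n}$, is (conditionally on $X$) a normally
distributed random variable with mean $\int_{(q-1)/n}^{q/n} b(X_s)\,ds$ and variance $\sigma^2/n$. It may therefore
be interpreted as a noisy observation of $X$.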
The description of the game is common knowledge and we consider the game played in
behavior strategies: at round $q$, player 1 and player 2 select simultaneously and independently
an action $i_q \in I$ for player 1 and $j_q \in J$ for player 2 using some lotteries depending on their
past observations.

Formally, a behavior strategy $\sigma$ for player 1 is a sequence $(\sigma_q)_{q \geq 0}$ of transition probabilities:

$$\sigma_q : ((K \times \mathbb{R}) \times I \times J)^q \times (K \times \mathbb{R}) \to \Delta(I),$$

where $\sigma_q(Z_0, i_0, j_0, \ldots, Z_{(q-1)/n}, i_{q-1}, j_{q-1}, Z_{q/n})$ denotes the lottery used to select the action $i_q$
played at round $q$ by player 1 when past actions played during the game are $(i_0, j_0, \ldots, i_{q-1}, j_{q-1})$
and the sequence of observations of player 1 is $(Z_0, \ldots, Z_{q/n})$. Let $\Sigma$ denote the set of behavior
strategies for player 1. Similarly, a behavior strategy $\tau$ for player 2 is a sequence $(\tau_q)_{q \geq 0}$ of
transition probabilities depending on his past observations

$$\tau_q : (\mathbb{R} \times I \times J)^q \times \mathbb{R} \to \Delta(J).$$

Let $\mathcal{T}$ denote the set of behavior strategies for player 2.

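Let $\mathbb{P}_{(n,p,y,\sigma,\tau)} \in \Delta(\mathbb{D}([0,\infty), K \times \mathbb{R}) \times (I \times J)^{\mathbb{N}})$ denote the probability on the set of trajectories
of $Z$ and actions induced by the strategies $\sigma, \tau$. The payoff function in $G_n(p,y)$ is defined by

$$\gamma_n(p, y, \sigma, \tau) := \mathbb{E}_{(n,p,y,\sigma,\tau)}\Big[\lambda_n \sum_{q \geq 0} (1 - \lambda_n)^q\, g(Z_{q/n}, i_q, j_q)\Big].$$

It is well known that the value of the game exists, i.e.

$$V_n(p,y) := \sup_{\sigma \in \Sigma} \inf_{\tau \in \mathcal{T}} \gamma_n(p, y, \sigma, \tau) = \inf_{\tau \in \mathcal{T}} \sup_{\sigma \in \Sigma} \gamma_n(p, y, \sigma, \tau).$$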

We also need to consider the value function $u$ of the non-revealing one-stage game $\Gamma(p, y)$,
which is a finite game with payoff $g$ in which player 1 cannot use his private information.
Precisely,

$$u(p,y) := \sup_{\sigma \in \Delta(I)} \inf_{\tau \in \Delta(J)} \sum_{i \in I} \sum_{j \in J} \sum_{k \in K} p(k)\,\sigma(i)\,\tau(j)\, g(k, y, i, j),$$

and the value exists (i.e. the sup and inf commute in the above formula) as it is a finite game.
It follows from standard arguments that $u$ is Lipschitz in $(p,y)$.
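
As an illustration of the non-revealing value, the following sketch computes $u(p)$ in a hypothetical example with two states, two actions per player and no dependence on $y$. The payoff matrices `G` are an assumption chosen for illustration (they are not taken from the paper), and the solver relies on the classical closed form for 2x2 zero-sum games rather than on a general linear program.

```python
def value_2x2(m):
    """Value of the zero-sum game with 2x2 payoff matrix m (row player maximizes)."""
    (a, b), (c, d) = m
    maxmin = max(min(a, b), min(c, d))
    minmax = min(max(a, c), max(b, d))
    if maxmin == minmax:  # saddle point in pure strategies
        return maxmin
    # No pure saddle point: classical closed form for completely mixed 2x2 games.
    return (a * d - b * c) / (a + d - b - c)

def non_revealing_value(p, payoffs):
    """u(p): value of the game whose matrix is the p-average of the state matrices."""
    avg = [[sum(p[k] * payoffs[k][i][j] for k in range(len(p))) for j in range(2)]
           for i in range(2)]
    return value_2x2(avg)

# Hypothetical example: G[0] is the payoff matrix in state 0, G[1] in state 1.
G = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
```

In this example the averaged matrix is diagonal with entries $p(0)$ and $p(1)$, so that $u(p) = p(0)\,p(1)$: the value vanishes at the vertices of the simplex and is maximal at the uniform belief.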

The main results proved in Sections 3 and 4 are two different characterizations for the
limit of the sequence of value functions $V_n$.

Let us now introduce some notations. 

Notation 2.3. 

• The natural filtration $\mathcal{F}^A$ of a process $(A_t)_{t \in [0,\infty)}$ is defined by $\mathcal{F}^A_t = \sigma(A_s, s \leq t)$. The
associated right-continuous filtration is denoted $\mathcal{F}^{A,+}$ with $\mathcal{F}^{A,+}_t := \bigcap_{s > t} \mathcal{F}^A_s$.

• For any topological space $E$, $\mathbb{D}([0,\infty), E)$ denotes the set of $E$-valued cadlag trajectories.

• For all $(p,y) \in \Delta(K) \times \mathbb{R}$, $\mathbb{P}_{p,y} \in \Delta(\mathbb{D}([0,+\infty), K \times \mathbb{R}))$ denotes the law of the process
$Z = (X, Y)$ with initial law $p \otimes \delta_y$.

Our first main result is the following probabilistic characterization.

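Theorem 2.4. For all $(p,y) \in \Delta(K) \times \mathbb{R}$,

$$V_n(p,y) \underset{n \to \infty}{\longrightarrow} V(p,y) := \max_{(Z_t, \pi_t)_{t \geq 0} \in \mathcal{B}(p,y)} \mathbb{E}\Big[\int_0^{+\infty} r e^{-rt} u(\pi_t, Y_t)\,dt\Big], \tag{2.2}$$

where $\mathcal{B}(p,y) \subset \Delta(\mathbb{D}([0,\infty), (K \times \mathbb{R}) \times \Delta(K)))$ denotes the set of laws of cadlag processes
$(Z_t, \pi_t)_{t \in [0,\infty)}$ such that:

• $(Z_t)_{t \geq 0}$ has law $\mathbb{P}_{p,y}$ and is an $\mathcal{F}^{(Z,\pi)}$-Markov process.

• For all $t \geq 0$, for all $k \in K$, $\pi_t(k) = \mathbb{P}(X_t = k \mid \mathcal{F}^{(\pi,Y)}_t)$.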

Let us briefly comment on this result. We generalize here the idea that the problem the informed
player is facing can be decomposed into two parts: first he may decide how information
will be used during the whole game, and then maximize his payoff under this constraint. To
apply this method of decomposition, we need to identify precisely the set $\mathcal{B}(p,y)$ of achievable
processes of posterior beliefs on $X$ of the less informed player. The filtration $\mathcal{F}^{(\pi,Y)}$ represents
the information of player 2, who observes the process $Y$ (a lower bound on information). The
condition that $Z$ is $\mathcal{F}^{(Z,\pi)}$-Markov reflects the fact that player 2 cannot learn any information
on the process $X$ which is not known by player 1 (an upper bound on information) and
the second condition simply says that $\pi$ represents the process of beliefs of player 2 on $X_t$.
Maximizers of the right-hand side of equation (2.2) represent optimal processes of revelation
for the informed player and induce asymptotically optimal strategies for the informed player
in the sequence of discretized games (see the proof of Theorem 2.4).
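
As an informal sanity check on the definition of $\mathcal{B}(p,y)$ (an illustration of ours, not a statement taken from the paper), two extreme belief processes appear to satisfy both conditions. The first is the non-revealing one, where $\pi$ is the filter $\chi_t(k) = \mathbb{P}(X_t = k \mid \mathcal{F}^{Y,+}_t)$ of $X$ given the public observation only. The second is the fully revealing one, $\pi_t = \delta_{X_t}$. Under that assumption, plugging them into (2.2) would give the elementary lower bound

$$V(p,y) \;\geq\; \max\Big\{ \mathbb{E}\Big[\int_0^{+\infty} r e^{-rt} u(\chi_t, Y_t)\,dt\Big],\ \mathbb{E}\Big[\int_0^{+\infty} r e^{-rt} u(\delta_{X_t}, Y_t)\,dt\Big] \Big\},$$

and the optimization in (2.2) can be read as a search, between these two extremes, for the best way to release information over time.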

We now turn to the second characterization. Define $b(y) := (b(k,y))_{k \in K} \in \mathbb{R}^K$ and for all
$k \in K$ and $t \geq 0$, define the optional projection (see footnote 1):

$$\chi_t(k) := \mathbb{P}(X_t = k \mid \mathcal{F}^{Y,+}_t).$$

Using Theorem 9.1 in [24] (see also the Note p.360 about the Markov property), the process
$\psi := (\chi, Y)$ with values in $\mathbb{R}^K \times \mathbb{R}$ is a diffusion process satisfying the following stochastic
differential equation:

$$\forall t \geq 0, \quad \psi_t = \psi_0 + \int_0^t c(\psi_s)\,ds + \int_0^t \kappa(\psi_s)\,d\bar{W}_s, \tag{2.3}$$

where $\bar{W}$ is a standard $\mathcal{F}^{Y,+}$-Brownian motion and the vectors $c(p,y)$ and $\kappa(p,y)$ in $\mathbb{R}^{K+1} = \mathbb{R}^K \times \mathbb{R}$ are defined by

$$c(p,y) := \big( {}^T\!R\,p,\ \langle p, b(y) \rangle \big),$$

$$\kappa(p,y) := \Big( \Big( \frac{p(k)\,\big(b(k,y) - \langle b(y), p \rangle\big)}{\sigma(y)} \Big)_{k \in K},\ \sigma(y) \Big),$$

where ${}^T\!R$ denotes the transpose of the matrix $R$ and probabilities are seen as column vectors.
We deduce from standard properties of diffusion processes that for any function $f \in C^2(\mathbb{R}^{K+1})$
with polynomial growth (say) and for all $0 \leq s \leq t$:

$$\mathbb{E}[f(\psi_t) \mid \mathcal{F}^{Y,+}_s] = f(\psi_s) + \mathbb{E}\Big[\int_s^t A(f)(\psi_u)\,du \;\Big|\; \mathcal{F}^{Y,+}_s\Big],$$

where $A(f)$ is the differential operator defined by (using the notation $z = (p,y)$)

$$Af(z) = \langle Df(z), c(z) \rangle + \frac{1}{2} \langle \kappa(z), D^2 f(z)\,\kappa(z) \rangle.$$

Footnote 1: In all the proofs, we consider only natural or right-continuous filtrations, but we adopt the same convention
as in [21] and do not complete the filtrations to avoid complex or ambiguous notations. Note that optional
projections of cadlag processes are well-defined and have almost surely cadlag paths (see appendix 1 in [13]).
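
The coefficients of the filtering equation can be written down explicitly. The sketch below assumes the standard (Wonham-type) form of the filter, in which the belief component of the drift is the transpose of $R$ applied to $p$ and the belief component of the diffusion vector is $p(k)\,(b(k,y) - \langle b(y), p\rangle)/\sigma(y)$. The function name and the representation of beliefs as Python lists are illustrative choices.

```python
def filter_coefficients(p, y, R, b, sigma):
    """Drift c(p, y) and diffusion vector kappa(p, y) of psi = (chi, Y).

    p     : belief on K (list of probabilities)
    R     : generator matrix of X (rows sum to 0)
    b     : drift function b(k, y);  sigma : volatility function sigma(y) > 0
    """
    K = range(len(p))
    bvec = [b(k, y) for k in K]
    mean_b = sum(p[k] * bvec[k] for k in K)            # <p, b(y)>
    # (transpose(R) p)(k) = sum_j R[j][k] * p[j]
    drift_p = [sum(R[j][k] * p[j] for j in K) for k in K]
    diff_p = [p[k] * (bvec[k] - mean_b) / sigma(y) for k in K]
    c = drift_p + [mean_b]          # last coordinate: drift of Y
    kappa = diff_p + [sigma(y)]     # last coordinate: volatility of Y
    return c, kappa
```

Both belief components sum to zero over $k$ (the rows of $R$ sum to zero, and the deviations of $b$ from its $p$-average have zero $p$-mean), which is consistent with the belief process staying in the hyperplane of vectors summing to one.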

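In order to state our second main result, we need to define precisely the notion of weak
solution we will use. Let $p \in \Delta(K)$, we define the tangent space at $p$ by

$$T_{\Delta(K)}(p) := \{ x \in \mathbb{R}^K \mid \exists \varepsilon > 0,\ p + \varepsilon x,\ p - \varepsilon x \in \Delta(K) \}.$$

Let $S^m$ denote the set of symmetric matrices of size $m$. For $S \in S^K$ and $p \in \Delta(K)$, we define

$$\lambda_{\max}(p, S) := \max\Big\{ \frac{\langle Sx, x \rangle}{|x|^2} \;\Big|\; x \in T_{\Delta(K)}(p) \setminus \{0\} \Big\}$$

and by convention $\lambda_{\max}(p, S) = -\infty$ whenever $T_{\Delta(K)}(p) = \{0\}$.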
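
For intuition, the quantity $\lambda_{\max}(p, S)$ is easy to compute when $K$ has two elements. The following sketch covers only that special case (the function name is ours): in the interior of the simplex the tangent space is the line spanned by $x = (1, -1)$, while at a vertex it reduces to $\{0\}$.

```python
def lambda_max_two_states(p, S):
    """lambda_max(p, S) when K has two elements and S is a symmetric 2x2 matrix."""
    if p[0] <= 0.0 or p[1] <= 0.0:
        # At a vertex of the simplex the tangent space is {0}.
        return float("-inf")
    # In the interior the tangent space is spanned by x = (1, -1), with |x|^2 = 2,
    # and <Sx, x> = S[0][0] - 2*S[0][1] + S[1][1].
    return (S[0][0] - 2.0 * S[0][1] + S[1][1]) / 2.0
```

A non-positive value at every interior point corresponds to concavity along the simplex, which is the role played by the second term of the obstacle-type equation stated next.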

Theorem 2.5. $V$ is the unique continuous viscosity solution of

$$\min\big\{ rV + H(z, DV(z), D^2V(z))\ ;\ -\lambda_{\max}(p, D^2_p V(z)) \big\} = 0 \tag{2.4}$$

where for all $(z, \xi, S) \in (\Delta(K) \times \mathbb{R}) \times \mathbb{R}^{K+1} \times S^{K+1}$:

$$H(z, \xi, S) := -\langle \xi, c(z) \rangle - \frac{1}{2} \langle \kappa(z), S\kappa(z) \rangle - r u(z),$$

and where $DV, D^2V$ denote the gradient and the Hessian matrix of $V$ and $D^2_p V(z)$ the Hessian
matrix of the function $V$ with respect to the variable $p$.

Let us recall the definitions of sub and super-solutions. 

Definition 2.6. We say that a bounded lower semi-continuous function $f$ is a (viscosity)
supersolution of the equation (2.4) on $\Delta(K) \times \mathbb{R}$ if for any test function $\phi$, $C^2$ in a neighborhood
of $\Delta(K) \times \mathbb{R}$ (in $\mathbb{R}^K \times \mathbb{R}$) such that $\phi \leq f$ on $\Delta(K) \times \mathbb{R}$ with equality at $(p,y) \in \Delta(K) \times \mathbb{R}$,
we have

$$\lambda_{\max}(p, D^2_p \phi(p,y)) \leq 0 \quad \text{and} \quad r\phi(p,y) - A(\phi)(p,y) - r u(p,y) \geq 0.$$

We say that a bounded upper semi-continuous function $f$ is a (viscosity) subsolution of the
equation (2.4) on $\Delta(K) \times \mathbb{R}$ if for any test function $\phi$, $C^2$ in a neighborhood of $\Delta(K) \times \mathbb{R}$ (in
$\mathbb{R}^K \times \mathbb{R}$) such that $\phi \geq f$ on $\Delta(K) \times \mathbb{R}$ with equality at $(p,y) \in \Delta(K) \times \mathbb{R}$, we have

$$\lambda_{\max}(p, D^2_p \phi(p,y)) < 0 \ \Longrightarrow\ r\phi(p,y) - A(\phi)(p,y) - r u(p,y) \leq 0.$$

The proof of Theorem 2.5 is based on Theorem 2.4 and on dynamic programming.

2.1 Possible extensions and open problems. 

We list below miscellaneous remarks. 

• In comparison to [9], in the statement of Theorem 2.4, we maximize over a set of joint
distributions of $(Z, \pi)$ rather than over the set of induced distributions for $(\pi, Y)$, which are
the only relevant variables for the computation of the objective functional. The latter
set of distributions is exactly the set of joint laws of cadlag processes $(\pi, Y)$ such that
for all bounded continuous functions $\phi$ on $\Delta(K) \times \mathbb{R}$ which are convex with respect to
the first variable, we have:

$$\forall\, 0 \leq s \leq t, \quad \mathbb{E}[\phi(\pi_t, Y_t) \mid \mathcal{F}^{(\pi,Y)}_s] \geq Q_{t-s}(\phi)(\pi_s, Y_s),$$

where $Q$ is the semi-group of the diffusion process $\psi$. We do not prove this claim but it
follows quite easily from Strassen's Theorem and the same techniques used in Lemma
4 in [9] and Lemma 5.11 in [18]. However, such a proof would not be constructive
(due to Strassen's theorem) and therefore, we do not think that this result would be
more interesting stated this way. Indeed, in order to construct asymptotically optimal
strategies following the proof of Theorem 2.4, player 1 has to compute the joint law of
$(Z, \pi)$ anyway (precisely the conditional law of $\pi$ given $Z$ at times $q/n$ for $q \geq 0$).

• One may generalize all the present results for the lower value functions to the case of
infinite action spaces $I$, $J$ (even if the value $u$ does not exist) by adapting the method
developed in [16]. Note that the proof of the same kind of results for the upper value 
functions may rely on different tools as shown in [16], and that the extension of these 
results in the present model remains an open question. 

• It can be shown directly (with classical arguments) that the functions $V_n$ and $V$ are
continuous. However, this neither simplifies nor shortens the proofs.

• It is reasonable to think that Theorem 2.4 can be extended to the case of more general
Feller processes $(X, Y)$, at least for diffusions with smooth coefficients. However, such an
extension leads to the following open question: is it possible to write a Hamilton-Jacobi
equation in the case of a diffusion process $Z = (X, Y)$ taking values in $\mathbb{R}^m \times \mathbb{R}^p$? Note
that such an equation would be stated in an infinite dimensional space of probability 
measures. 

• It would be interesting to try to find explicit solutions for simple examples with two 
states for X and with simple payoff functions and simple diffusion parameters for Y. 
Such an analysis and the comparison with the examples studied in [9] is left for future 
research. 


3 Proof of Theorem 2.4 

Recall the definition of conditional independence. 

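Definition 3.1. Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space and $\mathcal{F}, \mathcal{G}, \mathcal{H}$ three sub $\sigma$-fields of $\mathcal{A}$. We say
that $\mathcal{F}$ and $\mathcal{G}$ are conditionally independent given $\mathcal{H}$ if

$$\forall F \in \mathcal{F},\ \forall G \in \mathcal{G}, \quad \mathbb{P}(F \cap G \mid \mathcal{H}) = \mathbb{P}(F \mid \mathcal{H})\,\mathbb{P}(G \mid \mathcal{H}).$$

This relation is denoted $\mathcal{F} \perp\!\!\!\perp_{\mathcal{H}} \mathcal{G}$ and the definition extends to random variables by considering
the $\sigma$-fields they generate.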

The next definition is related to the characterization of the Markov property in terms of 
conditional independence and will be useful in the sequel. 

Definition 3.2. Given two random processes $(A_q, B_q)_{q \geq 0}$ (with values in some Polish spaces)
defined on $(\Omega, \mathcal{A}, \mathbb{P})$. We say that $(A_q)_{q \geq 0}$ is non-anticipative with respect to $(B_q)_{q \geq 0}$ if

$$\forall q \geq 0, \quad (A_0, \ldots, A_q) \perp\!\!\!\perp_{B_0, \ldots, B_q} (B_m)_{m \geq 0}.$$

The next result is a classical property of conditional independence and its proof is postponed
to the appendix.

Lemma 3.3. Given two random processes $(A_q, B_q)_{q \geq 0}$ (with values in some Polish spaces),
the process $(A_q)_{q \geq 0}$ is non-anticipative with respect to $(B_q)_{q \geq 0}$ if and only if there exists
(on a possibly enlarged probability space) a sequence of independent random variables $(\xi_q)_{q \geq 0}$
uniformly distributed on $[0,1]$ and independent of $(B_q)_{q \geq 0}$, and a sequence of measurable
functions $f_q$ (defined on appropriate spaces) such that for all $q \geq 0$

$$A_q = f_q(B_m, \xi_m, m \leq q).$$
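
The easy direction of this representation can be illustrated in a finite setting. The sketch below (with hypothetical names `sample_index`, `non_anticipative_path` and `kernel`) builds each $A_q$ from the history of $B$ up to stage $q$ and from private uniform draws, using inverse-CDF sampling. By construction, changing the future of $B$ cannot change the past of $A$.

```python
import random

def sample_index(probs, xi):
    """Inverse-CDF sampling: map a uniform draw xi in [0, 1) to an index with law probs."""
    acc = 0.0
    for idx, w in enumerate(probs):
        acc += w
        if xi < acc:
            return idx
    return len(probs) - 1  # guard against rounding when xi is close to 1

def non_anticipative_path(bs, kernel, rng):
    """Build A_q = f_q(B_0..B_q, xi_0..xi_q) from a path bs of B and private noise."""
    path = []
    for q in range(len(bs)):
        xi = rng.random()                    # xi_q, drawn independently of B
        probs = kernel(bs[: q + 1], path)    # law of A_q given the past only
        path.append(sample_index(probs, xi))
    return path
```

This is the mechanism used in the proof below to turn a belief process into a strategy: the informed player simulates the private noise himself and feeds it, together with his observations, into the functions $f_q$.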

The proof of Theorem 2.4 is divided in two steps and relies on the technical Lemma 3.8, 
whose proof is postponed to the next subsection. 

Step 1: We prove that $\liminf V_n \geq V$.

Let $\sigma^*(p,y)$ and $\tau^*(p,y)$ be measurable selections of optimal strategies for player 1 and 2
respectively, in the game $\Gamma(p,y)$ with value $u(p,y)$.

We start with a continuous-time process $(Z_t, \pi_t)_{t \geq 0}$ in $\mathcal{B}(p,y)$. We consider the discrete-time
process $(Z_{q/n}, \pi_{q/n})_{q \geq 0}$. Using the Markov property at times $q/n$, we deduce that $(\pi_{q/n})_{q \geq 0}$ is
non-anticipative with respect to $(Z_{q/n})_{q \geq 0}$. We now construct a strategy $\sigma$ in $G_n(p,y)$ depending
on the process $(Z, \pi)$. Using the conditional independence property (see Lemma 3.3), there
exists a sequence $(\xi_q)_{q \geq 0}$ of independent random variables uniformly distributed on $[0,1]$ and
independent from $(Z_{q/n})_{q \geq 0}$, and a sequence of measurable functions $(f_q)_{q \geq 0}$ such that

$$\pi_{q/n} = f_q\big((Z_{m/n}, \xi_m)_{m \leq q}\big) \quad \text{for all } q \geq 0.$$

We define player 1's strategy $\sigma$ as follows:

$$\sigma_q(Z_0, \ldots, Z_{q/n}, \xi_0, \ldots, \xi_q) := \sigma^*(\pi_{q/n}, Y_{q/n}).$$

This does not define formally a behavior strategy but these transition probabilities induce
a joint law for $(Z_{q/n}, i_q)_{q \geq 0}$ which can always be disintegrated in a behavior strategy (that
does not depend on player 2's actions) since the induced process $(i_q)_{q \geq 0}$ is by construction
non-anticipative with respect to $(Z_{q/n})_{q \geq 0}$ (using again Lemma 3.3). By taking the conditional
expectation given $(Y_{\ell/n}, \pi_{\ell/n}, i_\ell, j_\ell)_{\ell = 0, \ldots, q}$, the payoff at stage $q$ against any strategy $\tau$ is such
that:

$$\mathbb{E}_{n,p,y,\sigma,\tau}\big[g(X_{q/n}, Y_{q/n}, i_q, j_q)\big] = \mathbb{E}_{n,p,y,\sigma,\tau}\Big[\sum_{k \in K} \pi_{q/n}(k)\, g(k, Y_{q/n}, i_q, j_q)\Big] \geq \mathbb{E}_{n,p,y,\sigma,\tau}\big[u(\pi_{q/n}, Y_{q/n})\big].$$

Therefore, $\sigma$ is such that

$$V_n(p,y) \geq \inf_{\tau \in \mathcal{T}} \gamma_n(p, y, \sigma, \tau) \geq \sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, \mathbb{E}\big[u(\pi_{q/n}, Y_{q/n})\big].$$

Define $(Z^n, \pi^n)$ as the piecewise-constant process equal to $(Z, \pi)$ at times $q/n$ for $q \geq 0$. Then
$(Z^n, \pi^n)$ converges in probability to $(Z, \pi)$ (see e.g. Lemma VI.6.37 in [21]) and therefore

$$\sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, \mathbb{E}\big[u(\pi_{q/n}, Y_{q/n})\big] = \mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\pi^n_t, Y^n_t)\,dt\Big] \underset{n \to \infty}{\longrightarrow} \mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\pi_t, Y_t)\,dt\Big].$$

As $(Z, \pi) \in \mathcal{B}(p,y)$ was chosen arbitrarily, we deduce that:

$$\liminf_{n \to \infty} V_n(p,y) \geq V(p,y).$$


Step 2: We prove that $\limsup V_n \leq V$.

Let us fix $(p,y)$ and let $(\varepsilon_n)_{n \geq 1}$ be a positive sequence going to zero. For all $n \geq 1$, let $\sigma^n$ be
an $\varepsilon_n$-optimal behavior strategy for player 1 in $G_n(p,y)$. We will construct a strategy $\tau^n$ for
player 2 by induction such that for all $q \geq 0$ the expected payoff at round $q$ is not greater
than

$$\mathbb{E}_{n,p,y,\sigma^n,\tau^n}\big[u(p_q, Y_{q/n}) + C\,|p_q - \hat{p}_q|_1\big], \tag{3.1}$$

for some constant $C$ independent of $n$, where $|\cdot|_1$ denotes the $\ell^1$-norm and where for all $q \geq 0$,
$p_q$ and $\hat{p}_q$ denote respectively the conditional laws of $X_{q/n}$ given the information of player 2
before and after playing round $q$. Precisely, for all $k \in K$:

$$p_q(k) := \mathbb{P}_{(n,p,y,\sigma^n,\tau^n)}\big(X_{q/n} = k \mid Y_0, i_0, j_0, \ldots, Y_{(q-1)/n}, i_{q-1}, j_{q-1}, Y_{q/n}\big),$$

$$\hat{p}_q(k) := \mathbb{P}_{(n,p,y,\sigma^n,\tau^n)}\big(X_{q/n} = k \mid Y_0, i_0, j_0, \ldots, Y_{q/n}, i_q, j_q\big).$$

Note that the computation of $p_q$ does not depend on $\tau^n_q$. We can therefore define by induction
$\tau^n_q := \tau^*(p_q, Y_{q/n})$. Then, inequality (3.1) follows directly from Lemmas V.2.5 and V.2.6 in [25].
We now suppress the indices $(n, p, y, \sigma^n, \tau^n)$ from the probabilities and expectations. Using
that $u$ is Lipschitz with respect to $p$, we have

$$\mathbb{E}\big[u(p_q, Y_{q/n}) + C\,|p_q - \hat{p}_q|_1\big] \leq \mathbb{E}\big[u(\hat{p}_q, Y_{q/n}) + 2C\,|p_q - \hat{p}_q|_1\big].$$

Define also:

$$\tilde{p}_{q+1}(k) := \mathbb{P}\big(X_{(q+1)/n} = k \mid Y_0, i_0, j_0, \ldots, Y_{q/n}, i_q, j_q\big) = \big(e^{\frac{1}{n} {}^T\!R}\, \hat{p}_q\big)(k).$$

Note that for all $q \geq 0$, the sequence $(\tilde{p}_{q+1}, p_{q+1}, \hat{p}_{q+1})$ is a martingale so that using Jensen's
inequality,

$$\mathbb{E}\big[(\tilde{p}_{q+1}(k))^2\big] \leq \mathbb{E}\big[(p_{q+1}(k))^2\big].$$

On the other hand, using the previous equality, we can choose the constant $C$ so that almost
surely

$$\forall q \geq 0, \quad |\tilde{p}_{q+1} - \hat{p}_q| \leq \frac{C}{n}.$$

Mimicking the proof of [9], we have

$$\mathbb{E}\Big[\sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, |p_q - \hat{p}_q|_1\Big] = \sum_{k \in K} \sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, \mathbb{E}\big[|p_q(k) - \hat{p}_q(k)|\big]$$

$$\leq \sum_{k \in K} \Big( \sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, \mathbb{E}\big[|p_q(k) - \hat{p}_q(k)|^2\big] \Big)^{1/2}$$

$$= \sum_{k \in K} \Big( \sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, \mathbb{E}\big[(\hat{p}_q(k))^2 - (p_q(k))^2\big] \Big)^{1/2},$$

which is also equal to

$$\sum_{k \in K} \Big( \sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, \mathbb{E}\big[(\hat{p}_q(k))^2 - (\tilde{p}_{q+1}(k))^2 + (\tilde{p}_{q+1}(k))^2 - (p_{q+1}(k))^2 + (p_{q+1}(k))^2 - (p_q(k))^2\big] \Big)^{1/2}$$

and therefore is bounded from above by

$$\sum_{k \in K} \Big( \sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, \mathbb{E}\big[(p_{q+1}(k))^2 - (p_q(k))^2\big] + \frac{2C}{n} \Big)^{1/2} \leq |K| \Big(\lambda_n + \frac{2C}{n}\Big)^{1/2}.$$

We proved that:

$$V_n(p,y) \leq \sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, \mathbb{E}\big[u(\hat{p}_q, Y_{q/n})\big] + 2C\,|K| \Big(\lambda_n + \frac{2C}{n}\Big)^{1/2} + \varepsilon_n.$$

In order to conclude the proof, we consider the continuous-time process $(Z^n, \pi^n)$ which is
piecewise-constant and equal to $(Z_{q/n}, \hat{p}_q)$ at times $q/n$. Let us first extract a subsequence
of $V_n(p,y)$ which converges to $\limsup V_n(p,y)$. Then, using Lemma 3.8, there exists a further
subsequence of $(Z^n, \pi^n)$ which converges in law to some process $(Z, \pi)$ in $\mathcal{B}(p,y)$. We have
therefore along this subsequence

$$\sum_{q \geq 0} \lambda_n (1 - \lambda_n)^q\, \mathbb{E}\big[u(\pi^n_{q/n}, Y_{q/n})\big] = \mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\pi^n_t, Y^n_t)\,dt\Big] \longrightarrow \mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\pi_t, Y_t)\,dt\Big],$$

so that

$$\limsup_{n \to \infty} V_n(p,y) \leq \mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\pi_t, Y_t)\,dt\Big] \leq V(p,y).$$


3.1 A technical Lemma 

In reference to the paper of Meyer and Zheng [26], we will denote MZ the following topology 
on the set of cadlag paths. 

Notation 3.4. For a separable metric space $E$, the MZ-topology on the set $\mathbb{D}([0,\infty), E)$
of cadlag functions is the topology of convergence in measure when $[0,\infty)$ is endowed with the
measure $e^{-x}dx$. The associated weak topology over the set $\Delta(\mathbb{D}([0,\infty), E))$ when $\mathbb{D}([0,\infty), E)$
is endowed with the MZ-topology will be denoted $\mathcal{L}(MZ)$.

Remark 3.5. In contrast to the Skorokhod topology (Sk hereafter), if $E = F \times F'$ is a product
of separable metric spaces, the MZ topology is a product topology, i.e. (as topological spaces)

$$(\mathbb{D}([0,\infty), F \times F'), MZ) = (\mathbb{D}([0,\infty), F), MZ) \times (\mathbb{D}([0,\infty), F'), MZ).$$

The following remark will be used in the proofs.

Remark 3.6. If $E$ is a Polish space, the space $(\mathbb{D}([0,\infty), E), MZ)$ is a separable metric
space which is not topologically complete. However, its Borel $\sigma$-algebra is the same as the
one generated by the Sk topology and its topology is weaker than the Sk topology for which
the space is Polish, implying that all the probability measures are MZ-tight. Therefore, all
the results about disintegration and measurable selection usually stated for Polish spaces and
which depend only on the Borel structure apply to this space.
Recall that the transition probabilities of $Z$ are denoted $(P_t)_{t \geq 0}$, i.e. for any bounded
measurable function $\phi$ on $K \times \mathbb{R}$, we have

$$P_t(\phi)(z) := \mathbb{E}_z[\phi(Z_t)] = \int \phi\, dP_t(z),$$

and that $P$ is a Feller semi-group implying that $(z,t) \to P_t(\phi)(z)$ is continuous for any
bounded continuous function $\phi$.

Notation 3.7. Given a process $(Z_t)_{t \in [0,\infty)}$ of law $\mathbb{P}_{p,y}$, we define the process
$(\bar{Z}^n_t)_{t \in [0,\infty)} \in \mathbb{D}([0,\infty), K \times \mathbb{R})$ by

$$\forall t \geq 0, \quad \bar{Z}^n_t := Z_{\lfloor nt \rfloor / n},$$

where $\lfloor a \rfloor$ denotes the greatest integer lower or equal to $a$.


Lemma 3.8. Let $(p,y)$ be given, and let us consider a sequence of cadlag processes $(Z^n, \pi^n)$
that are piecewise constant on the partition $\{[\frac{q}{n}, \frac{q+1}{n})\}_{q \geq 0}$ and such that

• $Z^n$ has the same law as $\bar{Z}^n$ (see the above notation).

• $(\pi^n_{q/n})_{q \geq 0}$ is non-anticipative with respect to $(Z^n_{q/n})_{q \geq 0}$.

• For all $t \geq 0$, for all $k \in K$, $\pi^n_t(k) = \mathbb{P}(X^n_t = k \mid \mathcal{F}^{(\pi^n, Y^n)}_t)$.

Then, the sequence $(Z^n, \pi^n)$ admits an $\mathcal{L}(MZ)$-convergent subsequence and all the limit points
belong to $\mathcal{B}(p,y)$.


Proof. Let $Q_n$ denote the sequence of laws of the processes $(Z^n_t, \pi^n_t)_{t \in [0,\infty)}$. It follows from
Proposition VI.6.37 in [21] that $\bar{Z}^n$ $\mathcal{L}(MZ)$-converges to $Z$ of law $\mathbb{P}_{p,y}$. On the other hand, Theorem 4
in [26] together with a diagonal extraction implies that the set of possible laws for $(Z^n_t, \pi^n_t)_{t \geq 0}$
is MZ-relatively sequentially compact, and we may extract some convergent subsequence (see footnote 2).

Let us now prove that the limit belongs to $\mathcal{B}(p,y)$. Assume (without loss of generality) that the
sequence of processes $(Z^n_t, \pi^n_t)_{t \geq 0}$ $\mathcal{L}(MZ)$-converges to $(Z_t, \pi_t)_{t \geq 0}$. Note first that the law
of $(Z_t)_{t \geq 0}$ is $\mathbb{P}_{p,y}$ since the projection of the trajectories on the first coordinate is continuous
(see Remark 3.5).

Using Skorokhod's representation Theorem for separable metric spaces (see Theorem 11.7.31
in [14]), we can assume that the processes are defined on the same probability space and that
$(Z^n_t, \pi^n_t)_{t \geq 0} \to (Z_t, \pi_t)_{t \geq 0}$ almost surely. Up to extracting a subsequence, we can also assume
that there exists a subset $I$ of full measure in $[0,\infty)$ (i.e. $\int_I e^{-x}dx = 1$) such that for all $t \in I$,
$(Z^n_t, \pi^n_t) \to (Z_t, \pi_t)$ almost surely.

We now prove that for all $t \geq 0$ and all $k \in K$

$$\pi_t(k) = \mathbb{P}(X_t = k \mid \mathcal{F}^{(Y,\pi)}_t).$$

For any $t \in I$, any finite family $(t_1, \ldots, t_r)$ in $I \cap [0,t]$ and any bounded continuous function $\phi$
defined on $(\mathbb{R} \times \Delta(K))^r$, we have

$$\mathbb{E}\big[(\pi^n_t(k) - \mathbb{1}_{X^n_t = k})\,\phi(Y^n_{t_1}, \pi^n_{t_1}, \ldots, Y^n_{t_r}, \pi^n_{t_r})\big] = 0.$$

It follows by bounded convergence that

$$\mathbb{E}\big[(\pi_t(k) - \mathbb{1}_{X_t = k})\,\phi(Y_{t_1}, \pi_{t_1}, \ldots, Y_{t_r}, \pi_{t_r})\big] = 0.$$

We deduce that

$$\pi_t(k) = \mathbb{P}(X_t = k \mid \mathcal{F}^{(Y,\pi)}_t).$$

Given an arbitrary $t$, we take a decreasing sequence in $I$ with limit $t$ and applying Lemma
A.6 (see appendix), we obtain:

$$\pi_t(k) = \mathbb{P}(X_t = k \mid \mathcal{F}^{(Y,\pi),+}_t),$$

which implies the result using the tower property of conditional expectations.

It remains to prove the Markov property. Let $t_1 \leq \ldots \leq t_m \leq s \leq t$ in $I$, and $\phi, \phi'$ some
bounded continuous functions defined on $((K \times \mathbb{R}) \times \Delta(K))^m$ and $K \times \mathbb{R}$, we claim that

$$\mathbb{E}\big[\phi'(Z_t)\,\phi(Z_{t_1}, \pi_{t_1}, \ldots, Z_{t_m}, \pi_{t_m})\big] = \mathbb{E}\big[P_{t-s}(\phi')(Z_s)\,\phi(Z_{t_1}, \pi_{t_1}, \ldots, Z_{t_m}, \pi_{t_m})\big].$$

Indeed, for all $n$, we have

$$\mathbb{E}\big[\phi'(Z^n_t)\,\phi(Z^n_{t_1}, \pi^n_{t_1}, \ldots, Z^n_{t_m}, \pi^n_{t_m})\big] = \mathbb{E}\big[P_{\frac{\lfloor nt \rfloor - \lfloor ns \rfloor}{n}}(\phi')(Z^n_s)\,\phi(Z^n_{t_1}, \pi^n_{t_1}, \ldots, Z^n_{t_m}, \pi^n_{t_m})\big],$$

and the conclusion follows by bounded convergence. The property extends to arbitrary
$t_1 \leq \ldots \leq t_m \leq s \leq t$ by taking decreasing sequences in $I$ and we conclude as above that $Z$ is an
$\mathcal{F}^{(Z,\pi)}$-Markov process. □

Footnote 2: Precisely, for all $T > 0$ we may first apply this result to each coordinate of the processes $(Z^n_{t \wedge T}, \pi^n_{t \wedge T})_{t \geq 0}$.
Then, since convergent sequences are tight (see Theorem 11.5.3 in [14] and Remark 3.6), we apply Lemma A.3
to deduce that the set of laws $\{Q_n, n \geq 1\}$ is tight. Applying the direct part of Prohorov's theorem, which is
valid for separable metric spaces, we may extract some convergent subsequence.


Let us end this section with a second technical lemma whose proof is similar to Lemma 3.8. 

Lemma 3.9. The set-valued map $(p,y) \to (\mathcal{B}(p,y), \mathcal{L}(MZ))$ has a closed graph with compact
values.

Proof. Since $Z$ is a Feller process, the map $(p,y) \to \mathbb{P}_{p,y}$ is $\mathcal{L}(Sk)$-continuous (hence
$\mathcal{L}(MZ)$-continuous, see e.g. [15]). We omit the rest of the proof as it follows exactly from the same
arguments as Lemma 3.8 with obvious modifications. □


4 The variational characterization 

We state at first some properties of the function V. 

Proposition 4.1. V is upper-continuous and for all y € M, p V(p,y ) is concave on A (K). 

Proof. That V is upper semi-continuous follows directly from Lemma 3.9. 

Concavity follows from the same method as the well-known splitting Lemma (see e.g. Chapter 
V.l in [25]). Given y£l, pi,P 2 £ A(iL) and A 6 [0,1], Pi € B(pi,y) and P2 € &(p 2 ,y), let 
us construct Pa € B(\pi + (1 — X)p 2 ,y) as follows. Assume that (Z 1 ,^ 1 ) and (Z 2 ,7 t 2 ) are 
independent and of respective laws Pi and P2. Let £ be a random variable independent of 
(. Z 1 ,tt 1 ) and (Z 2 ,7 t 2 ) and such that P(£ = 1) = A and P(£ = 2) = 1 — A. Define {Z,tt) as the 
process equal to (Z*,7r*) on {£ = 1 }. It follows easily by conditioning on £ that 

POO POO POO 

E[ / re~ rt u(TT t , Yt)dt] = AE[ / re~ rt u('K\, Y^)dt] + (1 — A)E[ / re^ rt u(7r 2 , Y t 2 )dt], 

Jo Jo Jo 
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If we assume that (Z,tt) has a law Pa G £>(Api + (1 — X)p 2 ,y), then for any e > 0, we can 
choose Pi and P 2 as e-optimal probabilities so that 


/*oo 

V (Api + (1 - X)p 2 ,y) > 1E[ / re~ rt u(ir t , Y t )dt\ 

Jo 

/‘OO /*oo 

= AE[ / re _rt u(7r t 1 , Ft 1 )*] + (1 - A)E[ / re _rt ii(7r?, 

Jo Jo 

> AU(pi,y) + (1 - X)V(p 2 ,y) - e, 

and this proves that V is concave with respect to p as e can be chosen arbitrarily small. 

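In order to conclude, it therefore remains to prove that $(Z,\pi)$ has a law $P_\lambda \in \mathcal{B}(\lambda p_1 + (1-\lambda)p_2, y)$. Note first that $(Z_t)_{t \ge 0}$ has law $\mathbb{P}_{\lambda p_1 + (1-\lambda)p_2,\, y}$ by construction. Moreover, $\mathcal{F}_t^{\pi,Y}$ is included in $\sigma(\xi) \vee \mathcal{F}_t^{\pi^1,Y^1} \vee \mathcal{F}_t^{\pi^2,Y^2}$. Using independence, we therefore have for all $k \in K$ and all $t \ge 0$:
\begin{align*}
\mathbb{P}\big(X_t = k \mid \sigma(\xi) \vee \mathcal{F}_t^{\pi^1,Y^1} \vee \mathcal{F}_t^{\pi^2,Y^2}\big)
&= \mathbb{P}\big(X^1_t = k \mid \mathcal{F}_t^{\pi^1,Y^1}, \mathcal{F}_t^{\pi^2,Y^2}, \xi\big)\mathbf{1}_{\xi=1}
 + \mathbb{P}\big(X^2_t = k \mid \mathcal{F}_t^{\pi^1,Y^1}, \mathcal{F}_t^{\pi^2,Y^2}, \xi\big)\mathbf{1}_{\xi=2} \\
&= \mathbb{P}\big(X^1_t = k \mid \mathcal{F}_t^{\pi^1,Y^1}\big)\mathbf{1}_{\xi=1}
 + \mathbb{P}\big(X^2_t = k \mid \mathcal{F}_t^{\pi^2,Y^2}\big)\mathbf{1}_{\xi=2} \\
&= \pi^1_t(k)\mathbf{1}_{\xi=1} + \pi^2_t(k)\mathbf{1}_{\xi=2} = \pi_t(k),
\end{align*}
and using the tower property of conditional expectations, we deduce that
\[
\pi_t(k) = \mathbb{P}\big(X_t = k \mid \mathcal{F}_t^{\pi,Y}\big).
\]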

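To prove the Markov property, let $s \ge t$ and let $\phi$ be some bounded continuous function on $\Delta(K) \times \mathbb{R}$. As above, we have:
\begin{align*}
\mathbb{E}\big[\phi(Z_s) \mid \sigma(\xi) \vee \mathcal{F}_t^{Z^1,\pi^1} \vee \mathcal{F}_t^{Z^2,\pi^2}\big]
&= \sum_{i=1,2} \mathbf{1}_{\xi=i}\, \mathbb{E}\big[\phi(Z^i_s) \mid \sigma(\xi) \vee \mathcal{F}_t^{Z^1,\pi^1} \vee \mathcal{F}_t^{Z^2,\pi^2}\big] \\
&= \sum_{i=1,2} \mathbf{1}_{\xi=i}\, \mathbb{E}\big[\phi(Z^i_s) \mid \mathcal{F}_t^{Z^i,\pi^i}\big] \\
&= \sum_{i=1,2} \mathbf{1}_{\xi=i}\, P_{s-t}\phi(Z^i_t) = P_{s-t}\phi(Z_t).
\end{align*}
The conclusion follows by using the tower property of conditional expectation with the intermediate $\sigma$-field $\mathcal{F}_t^{Z,\pi}$. □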


4.1 Dynamic programming. 

Notation 4.2. In the following, we will use the notation $\mathbb{E}_{p,y}$ to denote the expectation associated to the diffusion process $\psi$ starting at time 0 with initial position $\psi_0 = (p,y)$.

We now state a dynamic programming principle which will be the key element for the 
proof of Theorem 2.5. 


Proposition 4.3. For all $(p,y) \in \Delta(K) \times \mathbb{R}$ and all $h \ge 0$, we have
\[
V(p,y) = \max_{(Z,\pi) \in \mathcal{B}(p,y)} \mathbb{E}\Big[\int_0^h r e^{-rt} u(\pi_t, Y_t)\,dt + e^{-rh} V(\pi_h, Y_h)\Big]. \tag{4.1}
\]
As a consequence,
\[
V(p,y) \ge \mathbb{E}_{p,y}\Big[\int_0^h r e^{-rt} u(\psi_t)\,dt + e^{-rh} V(\psi_h)\Big]. \tag{4.2}
\]
Moreover, if $(Z,\pi)$ is an optimal process for $V(p,y)$, then for all $h \ge 0$:
\[
V(p,y) = \mathbb{E}\Big[\int_0^h r e^{-rt} u(\pi_t, Y_t)\,dt + e^{-rh} V(\pi_h, Y_h)\Big]. \tag{4.3}
\]

Proof. We first prove that the maximum is reached in the right-hand side of (4.1). Let us define the MZ-topology on the set $\mathbb{D}([0,h], K \times \mathbb{R} \times \Delta(K))$ as the convergence in Lebesgue measure of the trajectories together with the convergence of the value of the process at time $h$. Note that this topology coincides (up to an identification) with the induced topology on the subset of $\mathbb{D}([0,\infty), K \times \mathbb{R} \times \Delta(K))$ made of trajectories that are constant on $[h,\infty)$. Using this identification and adapting the arguments of Lemma 3.8, the set of laws of the restrictions of the processes $(Z,\pi) \in \mathcal{B}(p,y)$ to the time interval $[0,h]$ is $\mathcal{L}(MZ)$-sequentially relatively compact in $\Delta(\mathbb{D}([0,h], K \times \mathbb{R} \times \Delta(K)))$. The existence of a maximum follows since the map
\[
P \in \Delta(\mathbb{D}([0,h], K \times \mathbb{R} \times \Delta(K))) \longmapsto \mathbb{E}_P\Big[\int_0^h r e^{-rt} u(\pi_t, Y_t)\,dt + e^{-rh} V(\pi_h, Y_h)\Big]
\]
is $\mathcal{L}(MZ)$ upper semi-continuous.

We now prove (4.1). We begin with a measurable selection argument. 

The function $P \in \mathcal{B}(p,y) \mapsto J(P) := \mathbb{E}_P\big[\int_0^\infty r e^{-rt} u(\pi_t, Y_t)\,dt\big]$ is $\mathcal{L}(MZ)$-continuous, and the set-valued map $(p,y) \mapsto \mathcal{B}(p,y)$ is $\mathcal{L}(MZ)$ upper semi-continuous. We deduce that the subset $O$ of the space $\Delta(K) \times \mathbb{R} \times \Delta(\mathbb{D}([0,\infty), (K \times \mathbb{R}) \times \Delta(K)))$ defined by
\[
O := \{(p,y,P) \mid P \in \mathcal{B}(p,y),\ J(P) \ge V(p,y)\}
\]
is Borel-measurable (see Remark 3.6). Moreover, Lemma 3.8 implies that for any $(p,y)$, there exists some $P$ such that $(p,y,P) \in O$. It therefore follows from Von Neumann's selection Theorem (see e.g. Proposition 7.49 in [2]) that there exists an optimal universally-measurable selection $\phi$ defined on $\Delta(K) \times \mathbb{R}$ such that for all $(p,y) \in \Delta(K) \times \mathbb{R}$, $\phi(p,y) \in \mathcal{B}(p,y)$ and $(p,y,\phi(p,y)) \in O$.

Let $(Z,\pi) \in \mathcal{B}(p,y)$ and $h > 0$, and let $\mu_h$ denote the joint law of $(\pi_h, Y_h)$. By construction, $\phi$ is $\mu_h$-almost surely equal to a Borel map $\tilde\phi$. Using Lemma A.5, we can construct a process $(\tilde\pi_s)_{s \ge h}$ (on some extension of the probability space) such that the conditional law of $(Z_{h+s}, \tilde\pi_{h+s})_{s \ge 0}$ given $\mathcal{F}_h^{\pi,Y}$ is precisely $\tilde\phi(\pi_h, Y_h)$ and such that there exist a variable $U$, uniformly distributed on $[0,1]$ and independent of $(Z,\pi)$, and a measurable map $\Phi$ such that
\[
(\tilde\pi_s)_{s \ge h} = \Phi\big((Z_s)_{s \ge h}, (\pi_h, Y_h), U\big). \tag{4.4}
\]
Let us consider the process $(Z,\hat\pi)$, where $\hat\pi$ is equal to $\pi$ on $[0,h)$ and to $\tilde\pi$ on $[h,\infty)$. Using the preceding construction, if we assume that the process $(Z,\hat\pi)$ has a law in $\mathcal{B}(p,y)$, we deduce that:
\begin{align*}
V(p,y) &\ge \mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\hat\pi_t, Y_t)\,dt\Big] \\
&= \mathbb{E}\Big[\int_0^h r e^{-rt} u(\pi_t, Y_t)\,dt\Big]
 + e^{-rh}\,\mathbb{E}\Big[\mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\tilde\pi_{h+t}, Y_{h+t})\,dt \,\Big|\, \mathcal{F}_h^{\pi,Y}\Big]\Big] \\
&\ge \mathbb{E}\Big[\int_0^h r e^{-rt} u(\pi_t, Y_t)\,dt\Big] + e^{-rh}\,\mathbb{E}[V(\pi_h, Y_h)],
\end{align*}
which would prove that
\[
V(p,y) \ge \max_{(Z,\pi) \in \mathcal{B}(p,y)} \mathbb{E}\Big[\int_0^h r e^{-rt} u(\pi_t, Y_t)\,dt + e^{-rh} V(\pi_h, Y_h)\Big]. \tag{4.5}
\]


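To conclude the proof of (4.5), we now check that the process $(Z,\hat\pi)$ has a law in $\mathcal{B}(p,y)$.

First, note that $(Z_t)_{t \ge 0}$ is a Markov process with initial law $p \otimes \delta_y$ by construction. Let us prove that for all $t \ge 0$, $\hat\pi_t(k) = \mathbb{P}(X_t = k \mid \mathcal{F}_t^{Y,\hat\pi})$. The result is obvious by construction for $t < h$. For $t \ge h$, let us consider two finite families $(t_1,\dots,t_m)$ in $[h,t]$ and $(t'_1,\dots,t'_\ell)$ in $[0,h)$ and two bounded continuous functions $\phi, \phi'$ defined on $(\Delta(K) \times \mathbb{R})^m$ and $(\Delta(K) \times \mathbb{R})^\ell$. Then:
\begin{align*}
&\mathbb{E}\big[\mathbf{1}_{X_t = k}\,\phi(\tilde\pi_{t_1}, Y_{t_1},\dots,\tilde\pi_{t_m}, Y_{t_m})\,\phi'(\pi_{t'_1}, Y_{t'_1},\dots,\pi_{t'_\ell}, Y_{t'_\ell})\big] \\
&\quad= \mathbb{E}\big[\mathbb{E}\big[\mathbf{1}_{X_t = k}\,\phi(\tilde\pi_{t_1}, Y_{t_1},\dots,\tilde\pi_{t_m}, Y_{t_m}) \mid \mathcal{F}_h^{\pi,Y}\big]\,\phi'(\pi_{t'_1}, Y_{t'_1},\dots,\pi_{t'_\ell}, Y_{t'_\ell})\big] \\
&\quad= \mathbb{E}\big[\tilde\pi_t(k)\,\phi(\tilde\pi_{t_1}, Y_{t_1},\dots,\tilde\pi_{t_m}, Y_{t_m})\,\phi'(\pi_{t'_1}, Y_{t'_1},\dots,\pi_{t'_\ell}, Y_{t'_\ell})\big] \\
&\quad= \mathbb{E}\big[\hat\pi_t(k)\,\phi(\hat\pi_{t_1}, Y_{t_1},\dots,\hat\pi_{t_m}, Y_{t_m})\,\phi'(\hat\pi_{t'_1}, Y_{t'_1},\dots,\hat\pi_{t'_\ell}, Y_{t'_\ell})\big].
\end{align*}
This property extends to bounded measurable functions of any finite family $(t_i)$ in $[0,t]$ by a monotone class argument, and we deduce that $\hat\pi_t(k) = \mathbb{P}(X_t = k \mid \mathcal{F}_t^{Y,\hat\pi})$.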

We now prove the Markov property. For $t \ge 0$, we have to prove that
\[
(Z_s)_{s \ge t} \perp\!\!\!\perp_{(Z_s)_{s \in [0,t]}} (\hat\pi_s)_{s \in [0,t]}. \tag{4.6}
\]
The case $t < h$ follows directly by construction. Let us consider the case $t \ge h$.

First, since the conditional law of $(Z_s, \tilde\pi_s)_{s \ge h}$ given $(\pi_h, Y_h)$ belongs to $\mathcal{B}(\pi_h, Y_h)$, we have:
\[
(\tilde\pi_s)_{s \in [h,t]} \perp\!\!\!\perp_{(Z_s)_{s \in [h,t]},\,(\pi_h, Y_h)} (Z_s)_{s \ge t}. \tag{4.7}
\]
Using (4.4) and the fact that $Z$ is an $\mathcal{F}^{Z,\pi}$-Markov process, we also have
\[
(\tilde\pi_s)_{s \in [h,t]} \perp\!\!\!\perp_{(Z_s)_{s \in [h,t]},\,(\pi_h, Y_h)} (Z_s, \pi_s)_{s \in [0,h]}, \tag{4.8}
\]
\[
(\tilde\pi_s)_{s \in [h,t]} \perp\!\!\!\perp_{(Z_s)_{s \ge t},\,(Z_s)_{s \in [h,t]},\,(\pi_h, Y_h)} (Z_s, \pi_s)_{s \in [0,h]}. \tag{4.9}
\]
From the characterization of conditional independence in terms of conditional laws recalled in Lemma A.2, properties (4.7), (4.8) and (4.9) together imply that:
\[
(\tilde\pi_s)_{s \in [h,t]} \perp\!\!\!\perp_{(Z_s, \pi_s)_{s \in [0,h]},\,(Z_s)_{s \in [h,t]}} (Z_s)_{s \ge t}. \tag{4.10}
\]
Using again the fact that $Z$ is an $\mathcal{F}^{Z,\pi}$-Markov process, we also have
\[
(Z_s)_{s \ge t} \perp\!\!\!\perp_{(Z_s)_{s \in [0,t]}} (\pi_s)_{s \in [0,h]}. \tag{4.11}
\]
Finally, (4.10) and (4.11) imply
\[
(Z_s)_{s \ge t} \perp\!\!\!\perp_{(Z_s)_{s \in [0,t]}} \big((\pi_s)_{s \in [0,h]}, (\tilde\pi_s)_{s \in [h,t]}\big), \tag{4.12}
\]
from which we deduce (4.6), since $(\hat\pi_s)_{s \in [0,t]}$ is a function of $\big((\pi_s)_{s \in [0,h]}, (\tilde\pi_s)_{s \in [h,t]}\big)$. This concludes the proof of (4.5).

In order to conclude the proof of (4.1), we now prove the reverse inequality. 

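Let $(Z,\pi)$ be an admissible process and $h > 0$. We check easily that the conditional law of $(Z_{h+s}, \pi_{h+s})_{s \ge 0}$ given $\mathcal{F}_h^{\pi,Y}$ belongs almost surely to $\mathcal{B}(\pi_h, Y_h)$. It follows that
\begin{align*}
\mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\pi_t, Y_t)\,dt\Big]
&= \mathbb{E}\Big[\int_0^h r e^{-rt} u(\pi_t, Y_t)\,dt\Big]
 + e^{-rh}\,\mathbb{E}\Big[\mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\pi_{h+t}, Y_{h+t})\,dt \,\Big|\, \mathcal{F}_h^{\pi,Y}\Big]\Big] \\
&\le \mathbb{E}\Big[\int_0^h r e^{-rt} u(\pi_t, Y_t)\,dt\Big] + e^{-rh}\,\mathbb{E}[V(\pi_h, Y_h)].
\end{align*}
The conclusion follows by taking the supremum over all admissible processes $(Z,\pi)$.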
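For a single fixed Markov process (no maximization over belief processes), the dynamic programming identity reduces to the semigroup identity $V = \int_0^h r e^{-rt} P_t u\,dt + e^{-rh} P_h V$. The sketch below checks this numerically on a toy two-state continuous-time Markov chain. Everything in it is an assumption made for illustration: the chain, its rates `a` and `b`, the payoff vector `u` and the helper names are not taken from the paper, and the example does not involve the set $\mathcal{B}(p,y)$.

```python
import math


def transition(t, a, b):
    """Transition matrix exp(tQ) of the two-state chain with Q = [[-a, a], [b, -b]]."""
    s = a + b
    decay = math.exp(-s * t)
    stationary = (b / s, a / s)
    return [
        [stationary[0] + decay * a / s, stationary[1] - decay * a / s],
        [stationary[0] - decay * b / s, stationary[1] + decay * b / s],
    ]


def apply(matrix, vector):
    """Matrix-vector product, i.e. x -> E_x[f(X_t)] for f given as a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def value(u, r, a, b):
    """V(x) = E_x[ int_0^inf r e^{-rt} u(X_t) dt ], computed in closed form."""
    s = a + b
    mean = (b * u[0] + a * u[1]) / s  # stationary average of u
    return [mean + r / (r + s) * (ux - mean) for ux in u]


def dpp_rhs(u, r, a, b, h, steps=2000):
    """E_x[ int_0^h r e^{-rt} u(X_t) dt + e^{-rh} V(X_h) ], integral by Simpson's rule."""
    v = value(u, r, a, b)
    dt = h / steps
    running = [0.0, 0.0]
    for i in range(steps + 1):
        t = i * dt
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        expected_u = apply(transition(t, a, b), u)
        for x in range(2):
            running[x] += weight * r * math.exp(-r * t) * expected_u[x] * dt / 3
    continuation = apply(transition(h, a, b), v)
    return [running[x] + math.exp(-r * h) * continuation[x] for x in range(2)]


# Hypothetical parameters chosen only for the check.
u, r, a, b, h = [1.0, -0.5], 0.8, 1.5, 0.6, 0.7
lhs = value(u, r, a, b)
rhs = dpp_rhs(u, r, a, b, h)
```

In this uncontrolled setting `lhs` and `rhs` agree up to quadrature error for every horizon `h`, which mirrors the equality case (4.3) along an optimal process.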

The inequality (4.2) follows directly from (4.1). Precisely, given a process $Z$ with initial law $p \otimes \delta_y$, define $\pi$ by $\pi_t(k) := \chi_t(k) = \mathbb{P}(X_t = k \mid \mathcal{F}_t^{Y,+})$ (optional projection). As explained before, $(Z,\pi)$ has a law in $\mathcal{B}(p,y)$ and $(\pi,Y)$ is a diffusion process with semi-group $Q$.

We finally prove (4.3). If $(Z,\pi) \in \mathcal{B}(p,y)$ is an optimal process (existence of a maximum follows from Lemma 3.9), then using the same arguments as above, we have for all $h \ge 0$:
\[
V(p,y) = \mathbb{E}\Big[\int_0^\infty r e^{-rt} u(\pi_t, Y_t)\,dt\Big]
\le \mathbb{E}\Big[\int_0^h r e^{-rt} u(\pi_t, Y_t)\,dt\Big] + e^{-rh}\,\mathbb{E}[V(\pi_h, Y_h)],
\]
and the conclusion follows from (4.1). □

4.2 Proof of Theorem 2.5.


Proof of Theorem 2.5. The proof is divided into two parts, showing respectively that the lower semicontinuous envelope $V_*$ of $V$ is a supersolution and that $V$ is a subsolution of (2.4). Uniqueness and continuity will follow from the comparison result (Theorem A.8), whose proof is postponed to the appendix.

Part 1: We prove that the lower semicontinuous envelope of $V$, denoted $V_*$, is a supersolution of (2.4).

Let $\phi$ be any smooth test function such that $\phi \le V_*$ with equality at $(p,y) \in \Delta(K) \times \mathbb{R}$. As $V_*$ is bounded, we may assume without loss of generality that $\phi$ is bounded. Consider a sequence $(p_n, y_n) \to (p,y)$ such that $V(p_n, y_n) \to V_*(p,y)$. From (4.2), and since $V \ge V_* \ge \phi$, we deduce that
\[
V(p_n, y_n) - e^{-rh}\,\mathbb{E}_{p_n,y_n}[\phi(\psi_h)] - \mathbb{E}_{p_n,y_n}\Big[\int_0^h r e^{-rs} u(\psi_s)\,ds\Big] \ge 0.
\]
Letting $n \to \infty$, we obtain (recall that $\psi$ is a Feller process):
\[
\phi(p,y) - e^{-rh}\,\mathbb{E}_{p,y}[\phi(\psi_h)] - \mathbb{E}_{p,y}\Big[\int_0^h r e^{-rs} u(\psi_s)\,ds\Big] \ge 0.
\]
Applying Itô's formula, we have
\[
\mathbb{E}_{p,y}[\phi(\psi_h)] = \phi(p,y) + \mathbb{E}_{p,y}\Big[\int_0^h A(\phi)(\psi_s)\,ds\Big].
\]
Dividing by $h$ and then letting $h \to 0$, it follows from the usual arguments that
\[
r\phi(p,y) - A(\phi)(p,y) - r u(p,y) \ge 0. \tag{4.13}
\]
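The "usual arguments" can be spelled out as follows. This is only a sketch, under the assumption that $A(\phi)$ and $u$ are bounded and continuous and that $\psi$ is right-continuous at time 0, which is what the boundedness and smoothness of the test function are used for. Substituting Itô's formula into the inequality obtained after letting $n \to \infty$ and dividing by $h > 0$ gives:

```latex
\[
\frac{1 - e^{-rh}}{h}\,\phi(p,y)
 \;-\; e^{-rh}\,\mathbb{E}_{p,y}\Big[\frac{1}{h}\int_0^h A(\phi)(\psi_s)\,ds\Big]
 \;-\; \mathbb{E}_{p,y}\Big[\frac{1}{h}\int_0^h r e^{-rs}\,u(\psi_s)\,ds\Big] \;\ge\; 0 .
\]
% As h -> 0:
%   (1 - e^{-rh})/h                      -> r,
%   (1/h) \int_0^h A(\phi)(\psi_s) ds    -> A(\phi)(p,y)   (bounded convergence),
%   (1/h) \int_0^h r e^{-rs} u(\psi_s) ds -> r u(p,y)      (bounded convergence),
% which yields  r\phi(p,y) - A(\phi)(p,y) - r u(p,y) \ge 0.
```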

Let us prove that $V_*$ is concave with respect to $p$. Let $y \in \mathbb{R}$ and $p = \lambda p_1 + (1-\lambda)p_2$ for some $p, p_1, p_2 \in \Delta(K)$ and $\lambda \in [0,1]$. Let $(p_n, y_n)$ be a sequence converging to $(p,y)$ such that $V(p_n, y_n) \to V_*(p,y)$. Then there exist $p_1^n, p_2^n \in \Delta(K)$ such that $p_n = \lambda p_1^n + (1-\lambda)p_2^n$ and $(p_1^n, p_2^n) \to (p_1, p_2)$ (this is for example a consequence of Lemma 8.2 in [23]). It follows that
\[
V(p_n, y_n) \ge \lambda V(p_1^n, y_n) + (1-\lambda) V(p_2^n, y_n).
\]
By letting $n \to \infty$ and using the definition of $V_*$, we deduce that
\[
V_*(p,y) \ge \lambda V_*(p_1, y) + (1-\lambda) V_*(p_2, y),
\]
which proves that $V_*$ is concave. We deduce that $\lambda_{\max}(p, D^2_p\phi(p,y)) \le 0$, and together with (4.13) this concludes the proof of the supersolution property.

Part 2: We prove that $V$ is a subsolution of (2.4).

Let $\phi$ be a smooth test function such that $\phi \ge V$ with equality at $\bar z = (\bar p, \bar y)$. We have to prove that if $\lambda_{\max}(\bar p, D^2_p\phi(\bar z)) < 0$, then $rV(\bar z) - A(\phi)(\bar z) - r u(\bar z) \le 0$.

Using Proposition 4.3, let $(Z,\pi) \in \mathcal{B}(\bar z)$ be an optimal process, so that for all $h \ge 0$, we have
\[
V(\bar z) = \mathbb{E}\Big[\int_0^h r e^{-rs} u(\pi_s, Y_s)\,ds + e^{-rh} V(\pi_h, Y_h)\Big]. \tag{4.14}
\]
Since $\lambda_{\max}(\bar p, D^2_p\phi(\bar z)) < 0$ (see e.g. the proof of Theorem 3.3 in [5]), there exists $\delta > 0$ such that for all $z = (p, \bar y)$ with $p \in \Delta(K)$ such that $p - \bar p \in T_{\Delta(K)}(\bar p)$, we have:
\[
V(z) \le V(\bar z) + \langle D_p\phi(\bar z), p - \bar p\rangle - \delta|p - \bar p|^2.
\]
As $\mathbb{E}[\pi_0] = \bar p$, the variable $\pi_0$ belongs almost surely to the smallest face of $\Delta(K)$ containing $\bar p$, so that $\pi_0 - \bar p \in T_{\Delta(K)}(\bar p)$. On the other hand, $Y_0 = \bar y$, so that (4.14) with $h = 0$ implies
\[
V(\bar z) = \mathbb{E}[V(\pi_0, \bar y)] \le V(\bar z) - \delta\,\mathbb{E}[|\pi_0 - \bar p|^2].
\]
We deduce that $\pi_0 = \bar p$ almost surely.

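Recall the definition of the process $\chi$ as an optional projection:
\[
\forall k \in K,\ \forall s \ge 0,\quad \chi_s(k) = \mathbb{P}(X_s = k \mid \mathcal{F}_s^{Y,+}).
\]
Lemma A.6 implies that $\pi_s(k) = \mathbb{P}(X_s = k \mid \mathcal{F}_s^{(\pi,Y),+})$, and we deduce that $\mathbb{E}[\pi_s \mid \mathcal{F}_s^{Y,+}] = \chi_s$ using the tower property of conditional expectations. Since $p \mapsto V(p,y)$ is concave (Proposition 4.1) and $Y_h$ is $\mathcal{F}_h^{Y,+}$-measurable, Jensen's inequality gives $\mathbb{E}[V(\pi_h, Y_h) \mid \mathcal{F}_h^{Y,+}] \le V(\chi_h, Y_h)$. Coming back to (4.14), this implies:
\[
\mathbb{E}\Big[\int_0^h r e^{-rs} u(\pi_s, Y_s)\,ds + e^{-rh} V(\pi_h, Y_h)\Big]
\le \mathbb{E}\Big[\int_0^h r e^{-rs} u(\pi_s, Y_s)\,ds + e^{-rh} V(\chi_h, Y_h)\Big].
\]
Since $V \le \phi$, we obtain
\[
V(\bar z) = \phi(\bar z) \le \mathbb{E}\Big[\int_0^h r e^{-rs} u(\pi_s, Y_s)\,ds + e^{-rh} \phi(\chi_h, Y_h)\Big].
\]
Rearranging the above inequality, dividing by $h$ and letting $h$ go to zero, it follows from the usual arguments (using that $\pi_s \to \pi_0$ when $s \to 0$, and Itô's formula) that:
\[
rV(\bar z) - A(\phi)(\bar z) - r u(\bar z) \le 0.
\]
□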


A Technical Proofs and auxiliary tools. 

A.1 Proof of Lemma 3.3

Let us now recall some properties of conditional independence. As we will manipulate conditional laws, we introduce a specific notation in order to shorten statements and proofs.

Notation A.1. Let $E$ be a Polish space and $A$ be an $E$-valued random variable defined on some probability space $(\Omega, \mathcal{A}, \mathbb{P})$.

• $[A]$ denotes the law of $A$.

• Given a $\sigma$-field $\mathcal{F} \subset \mathcal{A}$, $[A \mid \mathcal{F}]$ denotes a version of the conditional law of $A$ given $\mathcal{F}$, hence an $\mathcal{F}$-measurable random variable with values in $\Delta(E)$ (see e.g. [2], Proposition 7.26, for this last point).

Lemma A.2.

• Let $A, B, C$ be three random variables (with values in some Polish space) defined on the same probability space. $A$ is independent of $B$ conditionally on $C$ if and only if $[B \mid C] = [B \mid C, A]$.

• $A \perp\!\!\!\perp_C B$ if and only if there exist (on a possibly enlarged probability space) a random variable $\xi$ uniform on $[0,1]$ and independent of $(A,C)$, and a measurable function $f$ such that $B = f(C,\xi)$.

Proof. See Propositions 5.6 and 5.13 in [22]. □

Proof of Lemma 3.3. The "if" part is obvious. Let us prove the "only if" part. For $q = 0$, this is just Lemma A.2. However, we need to be more precise on how to construct this variable. We assume that there exists a family of independent variables $(\zeta_0, \dots, \zeta_n)$ uniformly distributed on $[0,1]$ and independent of $(A_0, B_0, \dots, A_n, B_n)$. Then, the variable $\xi_0$ given by Lemma A.2 can be constructed as a function of $(A_0, B_0, \zeta_0)$ (see the proof of Proposition 5.13 in [22]). Let us now proceed by induction and assume that the above property is true for $p \le q$ and that $\xi_p$ is measurable with respect to $(A_0, B_0, \zeta_0, \dots, A_p, B_p, \zeta_p)$. Since
\[
(A_0, \dots, A_{q+1}) \perp\!\!\!\perp_{(B_0, \dots, B_{q+1})} (B_0, \dots, B_n),
\]
we have
\[
[B_0, \dots, B_n \mid B_0, \dots, B_{q+1}, A_0, \dots, A_{q+1}] = [B_0, \dots, B_n \mid B_0, \dots, B_{q+1}].
\]
We deduce that
\[
[B_0, \dots, B_n \mid B_0, \dots, B_{q+1}, A_{q+1}] = [B_0, \dots, B_n \mid B_0, \dots, B_{q+1}].
\]
Using now the induction hypothesis and independence, we also have
\[
[B_0, \dots, B_n \mid B_0, \dots, B_{q+1}, \xi_0, \dots, \xi_q] = [B_0, \dots, B_n \mid B_0, \dots, B_{q+1}],
\]
\[
[B_0, \dots, B_n \mid B_0, \dots, B_{q+1}, \xi_0, \dots, \xi_q, A_{q+1}] = [B_0, \dots, B_n \mid B_0, \dots, B_{q+1}, A_{q+1}].
\]
Finally, we deduce that $A_{q+1} \perp\!\!\!\perp_{(\xi_0, \dots, \xi_q, B_0, \dots, B_{q+1})} (B_0, \dots, B_n)$, and the result then follows by applying Lemma A.2. □

A.2 Auxiliary Tools 

The following lemma is classical. 

Lemma A.3. Let $E, E'$ be two separable metric spaces and $A, A'$ two tight (resp. closed, convex) subsets of $\Delta(E)$ and $\Delta(E')$. Then the set $\mathcal{P}(A,A')$ of probabilities on $E \times E'$ having marginals in the sets $A$ and $A'$ is itself tight (resp. closed, convex).

Proof. Let us prove the tightness property. Let $\mu \in A$, $\nu \in A'$ and $\pi \in \mathcal{P}(\mu,\nu)$. By assumption, for any $\varepsilon > 0$ there is a compact $K_\varepsilon$ of $E$, independent of the choice of $\mu$ in $A$, such that $\mu(E \setminus K_\varepsilon) \le \varepsilon$, and a compact $K'_\varepsilon$ of $E'$, independent of the choice of $\nu$ in $A'$, such that $\nu(E' \setminus K'_\varepsilon) \le \varepsilon$. Then for any pair of random variables $(U,V)$ of law $\pi$:
\[
\mathbb{P}[(U,V) \notin K_\varepsilon \times K'_\varepsilon] \le \mathbb{P}[U \notin K_\varepsilon] + \mathbb{P}[V \notin K'_\varepsilon] \le 2\varepsilon.
\]
The closedness and convexity properties follow directly from the continuity and linearity of the map sending $\pi$ to its marginals. □

The following theorem is well-known and allows one to construct variables with prescribed conditional laws.

Theorem A.4. (Blackwell-Dubins [3])

Let $E$ be a Polish space with $\Delta(E)$ the set of Borel probabilities on $E$, and $([0,1], \mathcal{B}([0,1]), \lambda)$ the unit interval equipped with Lebesgue's measure. There exists a measurable mapping
\[
\Phi : \Delta(E) \times [0,1] \to E
\]
such that for all $\mu \in \Delta(E)$, the law of $\Phi(\mu, U)$ is $\mu$, where $U$ is the canonical element in $[0,1]$.
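When $E$ is a finite set, a map with the property stated in Theorem A.4 can be written down explicitly by cutting $[0,1]$ into consecutive intervals of lengths $\mu(x)$. The sketch below is only meant to illustrate the statement in that special case: the function names `phi` and `law_of_phi` are assumptions, and the example says nothing about general Polish spaces or about the joint measurability in $(\mu, u)$, which is the substance of the theorem.

```python
def phi(mu, u):
    """Select a point of a finite space from u in [0, 1).

    mu is a list of (point, probability) pairs; the i-th point is returned when u
    falls in the i-th consecutive sub-interval of [0, 1], of length mu[i][1].
    """
    cumulative = 0.0
    for point, prob in mu:
        cumulative += prob
        if u < cumulative:
            return point
    return mu[-1][0]  # guards against rounding when u is very close to 1


def law_of_phi(mu, grid_size=10000):
    """Approximate the law of phi(mu, U), U uniform on [0, 1], on a midpoint grid."""
    counts = {}
    for i in range(grid_size):
        point = phi(mu, (i + 0.5) / grid_size)
        counts[point] = counts.get(point, 0) + 1
    return {point: count / grid_size for point, count in counts.items()}


# Hypothetical probability on a three-point space.
mu = [("a", 0.2), ("b", 0.5), ("c", 0.3)]
approx_law = law_of_phi(mu)
```

The dictionary `approx_law` reproduces the weights of `mu` up to the grid resolution, which is the finite counterpart of "the law of $\Phi(\mu, U)$ is $\mu$".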

In the proof of Proposition 4.3, we use indirectly this result together with a disintegration 
theorem. Precisely: 

Lemma A.5. Let $E, F$ be Polish spaces, $(\Omega, \mathcal{A}, \mathbb{P})$ be some probability space, $Y$ be an $E$-valued random variable defined on $\Omega$, and $\mathcal{F}$ a sub-$\sigma$-field of $\mathcal{A}$. Assume that $f$ is an $\mathcal{F}$-measurable map from $\Omega$ to $\Delta(E \times F)$ such that the marginal $f_1(\omega) \in \Delta(E)$ of $f(\omega)$ on the first coordinate is a version of the conditional law of $Y$ given $\mathcal{F}$. Then (up to enlarging the probability space) there exists a random variable $Z$ such that $f(\omega)$ is a version of the conditional law of $(Y,Z)$ given $\mathcal{F}$.

Proof. Up to enlarging the probability space, we may assume that there exists some random variable $U$ uniformly distributed on $[0,1]$ and independent of $(Y, \mathcal{F})$. Using Theorem A.4, one can define a variable $(\tilde Y, \tilde Z) = \Phi(f(\omega), U)$ having the property that $f_1(\omega)$ is a version of the conditional law of $\tilde Y$ given $\mathcal{F}$. Let $g(\omega, \tilde Y)$ be a version of the conditional law of $\tilde Z$ given $(\mathcal{F}, \tilde Y)$; it follows easily that $Z = \Phi(g(\omega, Y), U)$ fulfills the required properties. □

The next Lemma is a generalized martingale backward convergence theorem directly 
adapted from the corresponding result for classical forward martingales that can be found 
in chapter III of [25]. 

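Lemma A.6. Let $(X_n)_{n \ge 0}$ be a uniformly bounded sequence of real-valued random variables defined on some probability space $(\Omega, \mathcal{A}, \mathbb{P})$. Let $(\mathcal{F}_n)_{n \ge 0}$ be a decreasing sequence of sub-$\sigma$-fields of $\mathcal{A}$. Assume that $(X_n)_{n \ge 0}$ converges almost surely to some variable $X$. Then $(\mathbb{E}[X_n \mid \mathcal{F}_n])_{n \ge 0}$ converges almost surely to $\mathbb{E}[X \mid \bigcap_{n \ge 0} \mathcal{F}_n]$.

Proof. Define $X_n^+ = \sup_{m \ge n} X_m$ and $Y_n^+ = \mathbb{E}[X_n^+ \mid \mathcal{F}_n]$. The sequence $(X_n^+)$ is non-increasing with limit $X$ and we have
\[
Y_{n+1}^+ = \mathbb{E}[X_{n+1}^+ \mid \mathcal{F}_{n+1}] \le \mathbb{E}[X_n^+ \mid \mathcal{F}_{n+1}] = \mathbb{E}[Y_n^+ \mid \mathcal{F}_{n+1}].
\]
$(Y_n^+)$ is therefore a backward sub-martingale and converges almost surely to some variable $Y^+$ (see e.g. Theorem 30 p.24 in [13]) which is $\bigcap_{n \ge 0} \mathcal{F}_n$-measurable. Similarly, define $X_n^- = \inf_{m \ge n} X_m$ and $Y_n^- = \mathbb{E}[X_n^- \mid \mathcal{F}_n]$. Then $(Y_n^-)$ is a backward supermartingale which converges almost surely to some variable $Y^-$. To conclude, note that
\[
Y_n^- \le \mathbb{E}[X \mid \mathcal{F}_n] \le Y_n^+,
\]
and that $\mathbb{E}[Y_n^+ - Y_n^-] = \mathbb{E}[X_n^+ - X_n^-] \to 0$ by bounded convergence. Since $\mathbb{E}[X \mid \mathcal{F}_n]$ converges almost surely to $\mathbb{E}[X \mid \bigcap_{n \ge 0} \mathcal{F}_n]$, we deduce that $\mathbb{E}[X_n \mid \mathcal{F}_n]$ converges almost surely to $\mathbb{E}[X \mid \bigcap_{n \ge 0} \mathcal{F}_n]$, as $Y_n^- \le \mathbb{E}[X_n \mid \mathcal{F}_n] \le Y_n^+$. □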


A.3 Comparison

In this section we adapt the comparison principle given in [9] to supersolutions and subsolutions of (2.4).

Remark A.7. Note that the process $\chi$ takes values in $\Delta(K)$, and that our assumptions on $b$ and $\sigma$ imply that the functions $c$ and $\kappa$ are Lipschitz continuous and bounded on $\Delta(K) \times \mathbb{R}$. In the following, we will assume without loss of generality that the functions $c$ and $\kappa$ are bounded and Lipschitz on the whole space $\mathbb{R}^{K+1}$ (the explicit formula cannot be used directly, since the resulting functions would be unbounded and only locally Lipschitz). Similarly, we assume that the function $u$ is bounded and Lipschitz on the whole space $\mathbb{R}^{K+1}$.

With our assumptions on $c$ and $\kappa$, it is well known (see e.g. [11], p.19) that there exists a constant $C$ (depending on the Lipschitz constants of $c$, $\kappa$, $u$) such that for any $\eta > 0$, $z, z' \in \Delta(K) \times \mathbb{R}$, $\xi \in \mathbb{R}^{K+1}$ and symmetric matrices $S, S' \in \mathcal{S}^{K+1}$ with
\[
\begin{pmatrix} S & 0 \\ 0 & S' \end{pmatrix} \le \eta \begin{pmatrix} I & -I \\ -I & I \end{pmatrix},
\]
we have
\[
|u(z) - u(z')| \le C|z - z'|,
\]
\[
|\langle c(z), \xi\rangle - \langle c(z'), \xi\rangle| \le C|\xi|\,|z - z'|,
\]
\[
-\tfrac12\langle \kappa(z'), -S'\kappa(z')\rangle \le -\tfrac12\langle \kappa(z), S\kappa(z)\rangle + C\eta|z - z'|^2.
\]

Let us state the comparison principle. 

Theorem A.8. Let $w_1$ be a subsolution and $w_2$ be a supersolution of (2.4). Then $w_1 \le w_2$.

The rest of this subsection is devoted to the proof of this result. Let $w_1$ be a subsolution and $w_2$ be a supersolution of (2.4) (recall that $w_1, w_2$ are bounded functions). Our aim is to show that $w_1 \le w_2$. We argue by contradiction, and assume that
\[
M := \sup_{z \in \Delta(K) \times \mathbb{R}} \{w_1(z) - w_2(z)\} > 0. \tag{A.1}
\]
Because of the lack of compactness, let $\beta > 0$ and $g(y) := \sqrt{1 + y^2}$. Define
\[
M' := \sup_{z \in \Delta(K) \times \mathbb{R}} \{w_1(z) - w_2(z) - 2\beta g(y)\}.
\]
We choose $\beta$ sufficiently small so that $M' > 2C_1\beta/r > 0$, with $C_1 = \|c\|_\infty + \|\kappa\|_\infty$.
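Reading the penalization as $g(y) = \sqrt{1+y^2}$, the properties of $g$ used later in the proof (bounded first and second derivatives, together with growth at infinity to compensate for the lack of compactness in $y$) follow from a short computation, recorded here as a reader's aid rather than as part of the original argument:

```latex
\[
g'(y) = \frac{y}{\sqrt{1+y^2}}, \qquad
g''(y) = \frac{1}{(1+y^2)^{3/2}},
\]
\[
\text{so that}\qquad |g'(y)| \le 1, \qquad 0 < g''(y) \le 1, \qquad g(y) \ge |y|
\quad \text{for all } y \in \mathbb{R}.
\]
```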

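We first regularize the maps $w_1$ and $w_2$ by quadratic sup- and inf-convolution respectively. This technique is classical (see [11] for details): for $\delta > 0$ and $z \in \mathbb{R}^{K+1}$ we define
\[
w_1^\delta(z) := \max_{z' \in \Delta(K) \times \mathbb{R}} \Big\{ w_1(z') - \frac{1}{2\delta}|z - z'|^2 \Big\},
\qquad
w_{2,\delta}(z) := \min_{z' \in \Delta(K) \times \mathbb{R}} \Big\{ w_2(z') + \frac{1}{2\delta}|z - z'|^2 \Big\}.
\]
Note that $w_1^\delta$ and $w_{2,\delta}$ are defined on the whole space $\mathbb{R}^{K+1}$ and that $w_1^\delta$ is semiconvex while $w_{2,\delta}$ is semiconcave. Moreover, we have the following growth property (uniformly in $y$):
\[
\lim_{|p| \to +\infty} |p|^{-1} w_1^\delta(p,y) = -\infty, \qquad
\lim_{|p| \to +\infty} |p|^{-1} w_{2,\delta}(p,y) = +\infty.
\]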
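The semiconvexity of a quadratic sup-convolution can be seen from the fact that $w^\delta(z) + |z|^2/(2\delta)$ is a maximum of affine functions of $z$. The following sketch checks this numerically in one dimension. The test function `w`, the compact domain $[0,1]$ standing in for $\Delta(K) \times \mathbb{R}$, the grids and the value of `delta` are all assumptions chosen for illustration.

```python
def sup_convolution(w, domain, delta, z):
    """Quadratic sup-convolution: max over z' in domain of w(z') - |z - z'|^2 / (2 delta)."""
    return max(w(zp) - (z - zp) ** 2 / (2 * delta) for zp in domain)


def w(z):
    """A bounded Lipschitz test function on [0, 1] with a concave kink at 1/2."""
    return -abs(z - 0.5)


delta = 0.1
domain = [i / 200 for i in range(201)]        # grid on the compact set [0, 1]
points = [-1 + i / 100 for i in range(301)]   # evaluation grid on [-1, 2]
values = [sup_convolution(w, domain, delta, z) for z in points]

# w^delta + |z|^2 / (2 delta) is a maximum of affine functions of z, hence convex,
# so its discrete second differences should be nonnegative up to rounding.
shifted = [v + z * z / (2 * delta) for v, z in zip(values, points)]
second_differences = [
    shifted[i - 1] - 2 * shifted[i] + shifted[i + 1]
    for i in range(1, len(points) - 1)
]
```

The list `second_differences` is nonnegative up to floating-point error, and `values` dominates `w` on $[0,1]$, as expected of a sup-convolution. The inf-convolution $w_{2,\delta}$ is handled symmetrically by applying the same function to $-w_2$ and changing sign.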

Define (with $z^i = (p^i, y^i)$):
\[
M_\delta := \sup_{z^1, z^2 \in \mathbb{R}^{K+1}} \Big\{ w_1^\delta(z^1) - w_{2,\delta}(z^2) - \beta\big(g(y^1) + g(y^2)\big) - \frac{1}{2\delta}|z^1 - z^2|^2 \Big\}. \tag{A.2}
\]

The following result is classical.

Lemma A.9. For any $\delta > 0$, the problem (A.2) has at least one maximum point. If $(z^1_\delta, z^2_\delta)$ is such a maximum point and if $(z^1)'_\delta \in \Delta(K) \times \mathbb{R}$ and $(z^2)''_\delta \in \Delta(K) \times \mathbb{R}$ are such that
\[
w_1^\delta(z^1_\delta) = w_1((z^1)'_\delta) - \frac{1}{2\delta}|z^1_\delta - (z^1)'_\delta|^2
\quad\text{and}\quad
w_{2,\delta}(z^2_\delta) = w_2((z^2)''_\delta) + \frac{1}{2\delta}|z^2_\delta - (z^2)''_\delta|^2, \tag{A.3}
\]
then, as $\delta \to 0$, $M_\delta \to M'$ while
\[
\frac{|z^1_\delta - z^2_\delta|^2 + |z^1_\delta - (z^1)'_\delta|^2 + |z^2_\delta - (z^2)''_\delta|^2}{2\delta} \to 0.
\]
We first prove that the regularized sub/supersolutions are sub/supersolutions of slightly modified equations.

Lemma A.10. Assume that $w_1^\delta$ has a second order Taylor expansion at a point $\bar z$. Then
\[
\min\big\{ r w_1^\delta(\bar z) + H(\bar z', Dw_1^\delta(\bar z), D^2 w_1^\delta(\bar z)) \,;\, -\lambda_{\max}(\bar p', D^2_p w_1^\delta(\bar z)) \big\} \le 0, \tag{A.4}
\]
where $\bar z' = (\bar p', \bar y') \in \Delta(K) \times \mathbb{R}$ is such that $w_1^\delta(\bar z) = w_1(\bar z') - \frac{1}{2\delta}|\bar z - \bar z'|^2$.

Similarly, if $w_{2,\delta}$ has a second order Taylor expansion at a point $\bar z$, then
\[
r w_{2,\delta}(\bar z) + H(\bar z'', Dw_{2,\delta}(\bar z), D^2 w_{2,\delta}(\bar z)) \ge 0, \tag{A.5}
\]
where $\bar z'' \in \Delta(K) \times \mathbb{R}$ is such that $w_{2,\delta}(\bar z) = w_2(\bar z'') + \frac{1}{2\delta}|\bar z - \bar z''|^2$.

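Proof. We do the proof for $w_1^\delta$, the second part being similar. Assume that $w_1^\delta$ has a second order Taylor expansion at a point $\bar z$ and set, for $\gamma > 0$ small,
\[
\phi_\gamma(z) := \langle Dw_1^\delta(\bar z), z - \bar z\rangle + \frac12\langle z - \bar z, D^2 w_1^\delta(\bar z)(z - \bar z)\rangle + \frac{\gamma}{2}|z - \bar z|^2.
\]
Let $\bar z'$ denote a point in $\Delta(K) \times \mathbb{R}$ such that $w_1^\delta(\bar z) = w_1(\bar z') - \frac{1}{2\delta}|\bar z - \bar z'|^2$.

Then $w_1^\delta - \phi_\gamma$ has a maximum at $\bar z$, which implies, by definition of $w_1^\delta$, that
\[
w_1(z') - \frac{1}{2\delta}|z' - z|^2 \le \phi_\gamma(z) - \phi_\gamma(\bar z) + w_1^\delta(\bar z) \qquad \forall z \in \mathbb{R}^{K+1},\ \forall z' \in \Delta(K) \times \mathbb{R},
\]
with an equality for $(z, z') = (\bar z, \bar z')$. If we choose $z = z' - \bar z' + \bar z$ in the above formula, we obtain:
\[
w_1(z') \le \phi_\gamma(z' - \bar z' + \bar z) + \frac{1}{2\delta}|\bar z' - \bar z|^2 - \phi_\gamma(\bar z) + w_1^\delta(\bar z) \qquad \forall z' \in \Delta(K) \times \mathbb{R},
\]
with an equality at $z' = \bar z'$. As $w_1$ is a subsolution, we therefore obtain, using the right-hand side of the above inequality as a test function,
\[
\min\big\{ r w_1(\bar z') + H(\bar z', D\phi_\gamma(\bar z), D^2\phi_\gamma(\bar z)) \,;\, -\lambda_{\max}(\bar p', D^2_p\phi_\gamma(\bar z)) \big\} \le 0. \tag{A.6}
\]
By construction, we have $D\phi_\gamma(\bar z) = Dw_1^\delta(\bar z)$, $D^2\phi_\gamma(\bar z) = D^2 w_1^\delta(\bar z) + \gamma I$ and $w_1(\bar z') \ge w_1^\delta(\bar z)$. The conclusion therefore follows by letting $\gamma \to 0$. □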

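In order to use inequality (A.4), we have to produce points at which $w_1^\delta$ is strictly concave with respect to the first variable. For this reason, as in [9], we introduce an additional penalization. For $\sigma > 0$ and $z^i = (p^i, y^i) \in \mathbb{R}^{K+1}$, we consider
\[
M_{\delta,\sigma} := \sup_{(z^1, z^2) \in (\mathbb{R}^{K+1})^2} \Big\{ w_1^\delta(z^1) - w_{2,\delta}(z^2) - \beta\big(g(y^1) + g(y^2)\big) + \sigma g(|p^1|) - \frac{1}{2\delta}|z^1 - z^2|^2 \Big\}.
\]
One easily checks that there exists a maximizer $(\hat z^1, \hat z^2)$, with $\hat z^i = (\hat p^i, \hat y^i)$, to the above problem. In order to use Jensen's Lemma (Lemma A.3 in [11]), we also need this maximum to be strict. For this we modify the penalization: we set, for $i = 1, 2$:
\[
\xi_1(p^1) = g(|p^1|) - \sigma g(|p^1 - \hat p^1|), \qquad \zeta_i(y^i) = -\beta g(y^i) - \sigma g(y^i - \hat y^i).
\]
We choose $\sigma > 0$ sufficiently small so that $\xi_1$ has a positive second order derivative. By definition,
\[
M_{\delta,\sigma} = \sup_{(z^1, z^2) \in (\mathbb{R}^{K+1})^2} \Big\{ w_1^\delta(z^1) - w_{2,\delta}(z^2) + \zeta_1(y^1) + \zeta_2(y^2) + \sigma\xi_1(p^1) - \frac{1}{2\delta}|z^1 - z^2|^2 \Big\},
\]
and the above problem has a strict maximum at $(\hat z^1, \hat z^2)$. As the map $(z^1, z^2) \mapsto w_1^\delta(z^1) - w_{2,\delta}(z^2) + \zeta_1(y^1) + \zeta_2(y^2) + \sigma\xi_1(p^1) - \frac{1}{2\delta}|z^1 - z^2|^2$ is semiconvex, Jensen's Lemma (together with Alexandrov's theorem) states that, for any $\varepsilon > 0$, there is a vector $a_\varepsilon \in (\mathbb{R}^{K+1})^2$ with $|a_\varepsilon| \le \varepsilon$, such that the problem
\[
M_{\delta,\sigma,\varepsilon} := \sup_{(z^1, z^2) \in (\mathbb{R}^{K+1})^2} \Big\{ w_1^\delta(z^1) - w_{2,\delta}(z^2) + \zeta_1(y^1) + \zeta_2(y^2) + \sigma\xi_1(p^1) - \frac{1}{2\delta}|z^1 - z^2|^2 + \langle a_\varepsilon, (z^1, z^2)\rangle \Big\}
\]
has a maximum point $(z^1_{\delta,\sigma,\varepsilon}, z^2_{\delta,\sigma,\varepsilon}) \in (\mathbb{R}^{K+1})^2$ at which the maps $w_1^\delta$ and $w_{2,\delta}$ have a second order Taylor expansion. From Lemma A.10, we have
\[
\min\big\{ r w_1^\delta(z^1_{\delta,\sigma,\varepsilon}) + H\big((z^1)'_{\delta,\sigma,\varepsilon}, Dw_1^\delta(z^1_{\delta,\sigma,\varepsilon}), D^2 w_1^\delta(z^1_{\delta,\sigma,\varepsilon})\big) \,;\, -\lambda_{\max}\big((p^1)'_{\delta,\sigma,\varepsilon}, D^2_p w_1^\delta(z^1_{\delta,\sigma,\varepsilon})\big) \big\} \le 0, \tag{A.7}
\]
and
\[
r w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon}) + H\big((z^2)''_{\delta,\sigma,\varepsilon}, Dw_{2,\delta}(z^2_{\delta,\sigma,\varepsilon}), D^2 w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon})\big) \ge 0, \tag{A.8}
\]
where $(z^1)'_{\delta,\sigma,\varepsilon}$ and $(z^2)''_{\delta,\sigma,\varepsilon}$ are points in $\Delta(K) \times \mathbb{R}$ at which one has
\[
w_1^\delta(z^1_{\delta,\sigma,\varepsilon}) = w_1((z^1)'_{\delta,\sigma,\varepsilon}) - \frac{1}{2\delta}|z^1_{\delta,\sigma,\varepsilon} - (z^1)'_{\delta,\sigma,\varepsilon}|^2
\quad\text{and}\quad
w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon}) = w_2((z^2)''_{\delta,\sigma,\varepsilon}) + \frac{1}{2\delta}|z^2_{\delta,\sigma,\varepsilon} - (z^2)''_{\delta,\sigma,\varepsilon}|^2.
\]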

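Using the properties of inf- and sup-convolutions, we have:
\[
Dw_1^\delta(z^1_{\delta,\sigma,\varepsilon}) = -\frac{1}{\delta}\big(z^1_{\delta,\sigma,\varepsilon} - (z^1)'_{\delta,\sigma,\varepsilon}\big)
\quad\text{and}\quad
Dw_{2,\delta}(z^2_{\delta,\sigma,\varepsilon}) = \frac{1}{\delta}\big(z^2_{\delta,\sigma,\varepsilon} - (z^2)''_{\delta,\sigma,\varepsilon}\big). \tag{A.9}
\]
By definition of $M_{\delta,\sigma,\varepsilon}$ we have, for all $(z^1, z^2) \in (\mathbb{R}^{K+1})^2$:
\[
w_1^\delta(z^1) - w_{2,\delta}(z^2) + \zeta_1(y^1) + \zeta_2(y^2) + \sigma\xi_1(p^1) \le M_{\delta,\sigma,\varepsilon} + \frac{1}{2\delta}|z^1 - z^2|^2 - \langle a_\varepsilon, (z^1, z^2)\rangle,
\]
with an equality at $(z^1_{\delta,\sigma,\varepsilon}, z^2_{\delta,\sigma,\varepsilon})$. Hence, writing $a_\varepsilon = (a^1_\varepsilon, a^2_\varepsilon)$,
\[
Dw_1^\delta(z^1_{\delta,\sigma,\varepsilon}) + \begin{pmatrix} \sigma D\xi_1(p^1_{\delta,\sigma,\varepsilon}) \\ -\beta g'(y^1_{\delta,\sigma,\varepsilon}) - \sigma g'(y^1_{\delta,\sigma,\varepsilon} - \hat y^1) \end{pmatrix}
= \frac{1}{\delta}\big(z^1_{\delta,\sigma,\varepsilon} - z^2_{\delta,\sigma,\varepsilon}\big) - a^1_\varepsilon, \tag{A.10}
\]
\[
-Dw_{2,\delta}(z^2_{\delta,\sigma,\varepsilon}) + \begin{pmatrix} 0 \\ -\beta g'(y^2_{\delta,\sigma,\varepsilon}) - \sigma g'(y^2_{\delta,\sigma,\varepsilon} - \hat y^2) \end{pmatrix}
= -\frac{1}{\delta}\big(z^1_{\delta,\sigma,\varepsilon} - z^2_{\delta,\sigma,\varepsilon}\big) - a^2_\varepsilon, \tag{A.11}
\]
while
\[
\begin{pmatrix} S & 0 \\ 0 & S' \end{pmatrix} \le \frac{1}{\delta}\begin{pmatrix} I & -I \\ -I & I \end{pmatrix}, \tag{A.12}
\]
with
\[
S := D^2 w_1^\delta(z^1_{\delta,\sigma,\varepsilon}) + \begin{pmatrix} \sigma D^2\xi_1(p^1_{\delta,\sigma,\varepsilon}) & 0 \\ 0 & -\beta g''(y^1_{\delta,\sigma,\varepsilon}) - \sigma g''(y^1_{\delta,\sigma,\varepsilon} - \hat y^1) \end{pmatrix},
\]
\[
S' := -D^2 w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon}) + \begin{pmatrix} 0 & 0 \\ 0 & -\beta g''(y^2_{\delta,\sigma,\varepsilon}) - \sigma g''(y^2_{\delta,\sigma,\varepsilon} - \hat y^2) \end{pmatrix}.
\]
This implies that $S \le -S'$ (see [11] p.19) and therefore
\[
D^2_p w_1^\delta(z^1_{\delta,\sigma,\varepsilon}) \le D^2_p w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon}) - \sigma D^2\xi_1(p^1_{\delta,\sigma,\varepsilon}). \tag{A.13}
\]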

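We now check that $\lambda_{\max}\big((p^1)'_{\delta,\sigma,\varepsilon}, D^2_p w_1^\delta(z^1_{\delta,\sigma,\varepsilon})\big) < 0$. Using the definition of $w_{2,\delta}$, for all $p \in \mathbb{R}^K$ and $p'' \in \Delta(K)$,
\[
w_{2,\delta}(p, y^2_{\delta,\sigma,\varepsilon}) \le w_2\big(p'', (y^2)''_{\delta,\sigma,\varepsilon}\big) + \frac{1}{2\delta}\Big(|p - p''|^2 + |y^2_{\delta,\sigma,\varepsilon} - (y^2)''_{\delta,\sigma,\varepsilon}|^2\Big),
\]
with an equality at $(p, p'') = \big(p^2_{\delta,\sigma,\varepsilon}, (p^2)''_{\delta,\sigma,\varepsilon}\big)$. If $m \in T_{\Delta(K)}\big((p^2)''_{\delta,\sigma,\varepsilon}\big)$ with $|m|$ small enough, taking $p := p^2_{\delta,\sigma,\varepsilon} + m$ and $p'' := (p^2)''_{\delta,\sigma,\varepsilon} + m$ gives
\[
w_{2,\delta}\big(p^2_{\delta,\sigma,\varepsilon} + m, y^2_{\delta,\sigma,\varepsilon}\big) \le w_2\big((p^2)''_{\delta,\sigma,\varepsilon} + m, (y^2)''_{\delta,\sigma,\varepsilon}\big) + \frac{1}{2\delta}\Big(|p^2_{\delta,\sigma,\varepsilon} - (p^2)''_{\delta,\sigma,\varepsilon}|^2 + |y^2_{\delta,\sigma,\varepsilon} - (y^2)''_{\delta,\sigma,\varepsilon}|^2\Big), \tag{A.14}
\]
with equality for $m = 0$. As $w_2$ is concave with respect to the first variable (see e.g. Lemma 3.2 in [5]), the above inequality implies that $\lambda_{\max}\big((p^2)''_{\delta,\sigma,\varepsilon}, D^2_p w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon})\big) \le 0$. In view of (A.13) we therefore get
\[
\lambda_{\max}\big((p^1)'_{\delta,\sigma,\varepsilon}, D^2_p w_1^\delta(z^1_{\delta,\sigma,\varepsilon})\big) \le -\sigma\lambda_{\min}\big((p^1)'_{\delta,\sigma,\varepsilon}, D^2\xi_1(p^1_{\delta,\sigma,\varepsilon})\big) < 0,
\]
because $D^2\xi_1 > 0$ by construction. So (A.7) becomes
\[
r w_1^\delta(z^1_{\delta,\sigma,\varepsilon}) + H\big((z^1)'_{\delta,\sigma,\varepsilon}, Dw_1^\delta(z^1_{\delta,\sigma,\varepsilon}), D^2 w_1^\delta(z^1_{\delta,\sigma,\varepsilon})\big) \le 0. \tag{A.15}
\]
We compute the difference of the two inequalities (A.15) and (A.8) above: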
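\[
r\big(w_1^\delta(z^1_{\delta,\sigma,\varepsilon}) - w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon})\big)
+ H\big((z^1)'_{\delta,\sigma,\varepsilon}, Dw_1^\delta(z^1_{\delta,\sigma,\varepsilon}), D^2 w_1^\delta(z^1_{\delta,\sigma,\varepsilon})\big)
- H\big((z^2)''_{\delta,\sigma,\varepsilon}, Dw_{2,\delta}(z^2_{\delta,\sigma,\varepsilon}), D^2 w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon})\big) \le 0,
\]
where, in view of (A.9) and the definitions of $S, S'$ (and using that $g', g''$ and $|D\xi_1|, |D^2\xi_1|$ are bounded by 1),
\[
H\big((z^1)'_{\delta,\sigma,\varepsilon}, Dw_1^\delta(z^1_{\delta,\sigma,\varepsilon}), D^2 w_1^\delta(z^1_{\delta,\sigma,\varepsilon})\big)
\ge H\Big((z^1)'_{\delta,\sigma,\varepsilon}, \frac{1}{\delta}\big(z^1_{\delta,\sigma,\varepsilon} - z^2_{\delta,\sigma,\varepsilon}\big), S\Big) - C_1(\beta + \varepsilon + \sigma),
\]
\[
H\big((z^2)''_{\delta,\sigma,\varepsilon}, Dw_{2,\delta}(z^2_{\delta,\sigma,\varepsilon}), D^2 w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon})\big)
\le H\Big((z^2)''_{\delta,\sigma,\varepsilon}, \frac{1}{\delta}\big(z^1_{\delta,\sigma,\varepsilon} - z^2_{\delta,\sigma,\varepsilon}\big), -S'\Big) + C_1(\beta + \varepsilon + \sigma).
\]
Next, we have:
\[
\big|u\big((z^1)'_{\delta,\sigma,\varepsilon}\big) - u\big((z^2)''_{\delta,\sigma,\varepsilon}\big)\big| \le C\big|(z^1)'_{\delta,\sigma,\varepsilon} - (z^2)''_{\delta,\sigma,\varepsilon}\big|,
\]
\[
\Big|\Big\langle c\big((z^1)'_{\delta,\sigma,\varepsilon}\big), \frac{1}{\delta}\big(z^1_{\delta,\sigma,\varepsilon} - z^2_{\delta,\sigma,\varepsilon}\big)\Big\rangle - \Big\langle c\big((z^2)''_{\delta,\sigma,\varepsilon}\big), \frac{1}{\delta}\big(z^1_{\delta,\sigma,\varepsilon} - z^2_{\delta,\sigma,\varepsilon}\big)\Big\rangle\Big|
\le \frac{C}{\delta}\big|(z^1)'_{\delta,\sigma,\varepsilon} - (z^2)''_{\delta,\sigma,\varepsilon}\big|\,\big|z^1_{\delta,\sigma,\varepsilon} - z^2_{\delta,\sigma,\varepsilon}\big|,
\]
\[
-\frac12\Big\langle \kappa\big((z^2)''_{\delta,\sigma,\varepsilon}\big), -S'\kappa\big((z^2)''_{\delta,\sigma,\varepsilon}\big)\Big\rangle
\le -\frac12\Big\langle \kappa\big((z^1)'_{\delta,\sigma,\varepsilon}\big), S\kappa\big((z^1)'_{\delta,\sigma,\varepsilon}\big)\Big\rangle + \frac{C}{\delta}\big|(z^1)'_{\delta,\sigma,\varepsilon} - (z^2)''_{\delta,\sigma,\varepsilon}\big|^2.
\]
We deduce that:
\begin{align*}
r\big(w_1^\delta(z^1_{\delta,\sigma,\varepsilon}) - w_{2,\delta}(z^2_{\delta,\sigma,\varepsilon})\big)
&\le C\Big(\frac{1}{\delta}\big|(z^1)'_{\delta,\sigma,\varepsilon} - (z^2)''_{\delta,\sigma,\varepsilon}\big|^2 + \big|(z^1)'_{\delta,\sigma,\varepsilon} - (z^2)''_{\delta,\sigma,\varepsilon}\big|\Big) \\
&\quad + \frac{C}{\delta}\big|(z^1)'_{\delta,\sigma,\varepsilon} - (z^2)''_{\delta,\sigma,\varepsilon}\big|\,\big|z^1_{\delta,\sigma,\varepsilon} - z^2_{\delta,\sigma,\varepsilon}\big| + 2C_1(\beta + \varepsilon + \sigma).
\end{align*}
As $\sigma$ and $\varepsilon$ tend to 0, the points $z^1_{\delta,\sigma,\varepsilon}$, $z^2_{\delta,\sigma,\varepsilon}$, $(z^1)'_{\delta,\sigma,\varepsilon}$ and $(z^2)''_{\delta,\sigma,\varepsilon}$ converge (up to a subsequence) to $z^1_\delta$, $z^2_\delta$, $(z^1)'_\delta$ and $(z^2)''_\delta$, where $(z^1_\delta, z^2_\delta)$ is a maximum in (A.2) and where $(z^1)'_\delta$ and $(z^2)''_\delta$ satisfy (A.3). The above inequality together with the definition of $M_\delta$ implies:
\begin{align*}
rM_\delta \le r\big(w_1^\delta(z^1_\delta) - w_{2,\delta}(z^2_\delta)\big)
&\le C\Big(\frac{1}{\delta}\big|(z^1)'_\delta - (z^2)''_\delta\big|^2 + \big|(z^1)'_\delta - (z^2)''_\delta\big|\Big) \\
&\quad + \frac{C}{\delta}\big|(z^1)'_\delta - (z^2)''_\delta\big|\,\big|z^1_\delta - z^2_\delta\big| + 2C_1\beta.
\end{align*}
We finally let $\delta \to 0$: in view of Lemma A.9, the above inequality yields $rM' = \lim_{\delta \to 0} rM_\delta \le 2C_1\beta$, which contradicts our initial assumption. Therefore $w_1 \le w_2$ and the proof is complete.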